DECLARATION UNDER 37 C.F.R. §1.131

As a below named inventor, I hereby declare that: 

1. My name is Dr. William Jack, Research Director for the DNA Enzymes 
Division at New England Biolabs Inc. My resume is attached. 

2. I have been studying the structure and function of DNA polymerases for 
over 16 years. 

3. I was a member of the group of scientists at New England Biolabs that
isolated, characterized, and cloned the first hyperthermophilic archaeal DNA 
polymerase. Our continuing work with archaeon DNA polymerases identified 
a surprisingly homogeneous set of enzymes. We claimed this group of DNA 
polymerases in US Patent 5,500,363. In this patent, the United States 
Patent and Trademark Office recognized the validity of our claim to a class of 
archaeon DNA polymerases defined by the DNA encoding the enzyme and its
ability to hybridize under defined conditions to various specified DNA
sequences. The group was exemplified by T. litoralis (Vent), GBD (Deep
Vent), and 9°N DNA Polymerases.

4. We also found that this group of polymerases had a high degree of amino 
acid sequence identity. A comparative three-dimensional alignment of 
members of this group of enzymes showed a high degree of structural 
conservation, consistent with the observed high degree of primary amino 
acid sequence identity/similarity. See for example, Vent (Rodriguez, et al., 
2000), Tgo (Hopfner, et al., 1999), D. Tok (Zhao, et al., 1999), and KOD 
(Hashimoto, et al., 2001) DNA Polymerases. 

5. The structural equivalence of this group of polymerases is further 
supported by experiments reported in Example 10 of the above application 
in which we show that mutation of an analogous residue in Vent and 9°N 
DNA Polymerases yields enzymes with equivalent acyclonucleotide 
incorporation efficiencies. 

6. We discovered that this group of enzymes is capable of efficiently utilizing 
acyclonucleotides as substrates. We demonstrated this property using four 
examples of polymerases within this tightly defined group. Any molecular 
biologist of ordinary skill in the art would expect from these findings that this 
property would occur in all members of the enzyme group defined above. 

7. Additionally, my colleagues and I have published articles in peer reviewed 
journals discussing the physical basis for the preferential incorporation of 
acyclonucleotides, and also for the enhanced incorporation with Vent A488L 
and 9°N A485L DNA Polymerase mutants. See Gardner, et al. (2004) on 
page 11841, column 1, paragraph 2 and page 11841, column 2, paragraph
1, respectively.

8. I assert that the combination of the high degree of homogeneity in DNA
and amino acid sequences of archaeon DNA polymerases, plus the structural 
evidence that modification of specific amino acids alters enzyme specificity, 
would be sufficient to assure a person of ordinary skill in the art that the 
class of polymerases as defined above will interact with acyclonucleotide 
substrates as shown in the above application. 

9. To further support the above statements, we have conducted additional 
experiments to confirm that archaeon Family B polymerases with an amino
acid sequence identity of greater than 30% can utilize acyclonucleotides as a
substrate. These data are attached to the present declaration as Appendix I.

9. I further declare under penalty of perjury pursuant to laws of the United
States of America that the foregoing is true and correct and that the 
Declaration was executed by me on: 
Appendix I

We have purified and characterized the Family B DNA polymerase from
the archaeon Methanococcus maripaludis, cloned from ATCC 43000. This 
polymerase has a 41% sequence identity and 63% sequence similarity with 
Vent DNA Polymerase when analyzed using NCBI Blast 2 and the default 
parameters. 

We performed the titration assay described in Example 1 of the patent 
application, using the Mma, Vent (exo-), and 9°N (exo+) DNA Polymerases. 
Experimental details and data are given in the attached figure. 

For each of the three polymerases, a comparison of lanes using 
dideoxyCTP (ddCTP) with those using equivalent concentrations of acycloCTP 
(acyCTP) reveals shorter products in lanes utilizing acyCTP. These shorter 
products result from more efficient insertion of the acyCTP terminator 
compared to incorporation of the ddCTP terminator. Thus, all three 
polymerases incorporated acyCTP more efficiently than ddCTP. 

Figure Legend 

The ability of acyNTPs and ddNTPs to act as chain terminators was 
tested using a titration assay of the type described in Example 1. 
Incorporation of ddCTP was compared to that of acyCTP, respectively, using 
Methanococcus maripaludis DNA polymerase, 9°N (exo+) DNA polymerase 
and Vent® (exo-) DNA polymerases. 

Incorporation of ddCTP and acyCTP was assayed by mixing 8 ul of
reaction cocktail (0.025 uM 5' [FAM] end-labeled #1224-primed M13mp18,
62.5 mM NaCl, 12.5 mM Tris-HCl (pH 7.9 at 25°C), 12.5 mM MgCl2, 1.25 mM
dithiothreitol, Methanococcus maripaludis DNA polymerase or 0.125 U/ul
9°N (exo+) DNA polymerase or 0.125 U/ul Vent® (exo-) DNA polymerase) 
with 2 ul of 5X nucleotide analog/nucleotide solution to yield the final ratios 
of analog :dNTP indicated in the figures. After incubating at 72°C for 20 
minutes, the reactions were halted by the addition of 10 ul formamide. 
Samples were then heated at 72°C for 3 minutes and a 1 ul aliquot was 
loaded on a 4% polyacrylamide urea gel and detected by an ABI377 
automated DNA sequencer. 
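
The mixing step in the legend implies a simple dilution: 8 ul of cocktail plus 2 ul of 5X nucleotide solution gives a 10 ul reaction, so every cocktail component ends up at 0.8 times its stated concentration. The following is a minimal sketch of that arithmetic only; the helper name `final_concentration` and the dictionary layout are illustrative assumptions, not part of the declaration.

```python
# Illustrative sketch: final concentrations after mixing 8 ul of reaction
# cocktail with 2 ul of 5X nucleotide analog/nucleotide solution.
# The function and variable names are hypothetical.

COCKTAIL_UL = 8.0
NUCLEOTIDE_MIX_UL = 2.0
TOTAL_UL = COCKTAIL_UL + NUCLEOTIDE_MIX_UL


def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Concentration of a component after dilution into the total volume."""
    return stock * stock_volume_ul / total_volume_ul


# Cocktail concentrations as given in the figure legend.
cocktail = {
    "primed M13mp18 (uM)": 0.025,
    "NaCl (mM)": 62.5,
    "Tris-HCl (mM)": 12.5,
    "MgCl2 (mM)": 12.5,
    "dithiothreitol (mM)": 1.25,
    "9N or Vent polymerase (U/ul)": 0.125,
}

final = {
    name: final_concentration(conc, COCKTAIL_UL, TOTAL_UL)
    for name, conc in cocktail.items()
}

# The 5X nucleotide solution is diluted five-fold to 1X.
nucleotide_fold = final_concentration(5.0, NUCLEOTIDE_MIX_UL, TOTAL_UL)

for name, conc in final.items():
    print(f"{name}: {conc:g}")
print(f"nucleotide solution: {nucleotide_fold:g}X")
```

Under these assumptions the cocktail appears to have been prepared at 1.25 times the intended reaction concentration so that the nucleotide addition brings it to 1X.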
Comparative kinetic and structural analyses of a variety of
polymerases have revealed both common and
divergent elements of nucleotide discrimination. Al- 
though the parameters for dNTP incorporation by the 
hyperthermophilic archaeal Family B Vent DNA poly- 
merase are similar to those previously derived for Fam- 
ily A and B DNA polymerases, parameters for analog 
incorporation reveal alternative strategies for discrim- 
ination by this enzyme. Discrimination against ribo- 
nucleotides was characterized by a decrease in the af- 
finity of NTP binding and a lower rate of phosphoryl 
transfer, whereas discrimination against ddNTPs was 
almost exclusively due to a slower rate of phosphodiester bond
formation. Unlike Family A DNA polymerases, incorporation of
9-[(2-hydroxyethoxy)methyl]X triphosphates (where X is adenine,
cytosine, guanine, or thymine; acyNTPs) by Vent DNA polymerase
was enhanced over ddNTPs via a 50-fold increase in
phosphoryl transfer rate. Furthermore, a mutant with
increased propensity for nucleotide analog incorporation
(Vent A488L DNA polymerase) had unaltered dNTP
incorporation while displaying enhanced nucleotide analog
binding affinity and rates of phosphoryl transfer.
Based on kinetic data and available structural informa- 
tion from other DNA polymerases, we propose active 
site models for dNTP, ddNTP, and acyNTP selection by 
hyperthermophilic archaeal DNA polymerases to ra- 
tionalize structural and functional differences between 
polymerases. 



All free living organisms encode several DNA polymerases 
that are jointly responsible for the replication and maintenance 
of their genomes, thereby ensuring accurate transmission of 
genetic information (1-3). The majority of identified DNA poly- 
merases can be classified into Families A, B, C, and Y according 
to amino acid sequence similarities to Escherichia coli polymerases
I, II, III, and IV/V, respectively (4, 5). Additional
families have been identified, including the two-subunit repli- 
cative DNA polymerases from hyperthermophilic Archaea 
(Family D) (6) and eukaryotic DNA polymerase β and terminal
transferases (Family X) (4). 

Structural and kinetic analyses of Family A (7-14) and Family B
(15-25) DNA polymerases have increased the understanding
of nucleotide selection and incorporation mechanisms. Although
amino acid sequences diverge between these two
families, the structures of Family A and B DNA polymerases 
share recognizable finger, thumb, and palm subdomains that 
allow comparison of structural elements important for function 
(3, 11). In the case of Family A DNA polymerases from bacte- 
riophage T7, Escherichia coli (Klenow fragment, large frag- 
ment of DNA polymerase I), and Thermus aquaticus, as well as 
the Family B DNA polymerase from bacteriophage RB69, in- 
terpretation of the structural information is complemented by 
steady-state and pre-steady-state kinetic studies, allowing a 
detailed description of the polymerization pathway. Reaction 
parameters describing the discrimination against naturally occurring
nucleotide analogs encountered in vivo, such as NTPs,
or unnatural nucleotide analogs, such as ddNTPs and dye- 
labeled ddNTPs (13, 25-30), have added insights into the basis 
for nucleotide discrimination. 

Hyperthermophilic archaeal DNA polymerases have not 
been scrutinized in such detail, hampering a complete charac- 
terization and comparison with other polymerases. Family B 
DNA polymerases from hyperthermophilic Archaea Thermococcus
sp. 9°N (22), Thermococcus gorgonarius (18), and Pyrococcus
kodakaraensis KOD1 (24) and mesophilic bacteriophage
RB69 (23) have high sequence and structural homologies and 
provide a framework for analysis of active site structure and 
function in this enzyme family (Fig. 1). Furthermore, steady- 
state kinetic studies have identified hyperthermophilic DNA
polymerase residues important for polymerization and exonu- 
clease activities and for nucleotide binding (18, 29, 31-35). 
Nucleotide analogs have also been important in identifying 
dNTP recognition determinants important in the polymerase 
reaction (32-36) and have proven useful in a variety of molec- 
ular biology applications, such as DNA sequencing and detec- 
tion of single nucleotide polymorphisms (37-41). One group of 
analogs, 9-[(2-hydroxyethoxy)methyl]X triphosphates (where X
is adenine, cytosine, guanine, or thymine; acyNTPs),¹ is particularly
intriguing due to the wide spectrum of incorporation
efficiency noted in different DNA polymerases, even within the 
same family of polymerase. For example, within Family B, the 
herpes simplex virus type 2 and human cytomegalovirus DNA 
polymerases incorporate acyNTPs more efficiently than 
ddNTPs, whereas human polymerase α more readily inserts
ddNTPs over acyNTPs (42). Such differences have been ex- 
ploited in drug therapies where infective agents encode poly- 
merases that more readily insert acyNTP than does the host 
DNA polymerase (43). Hyperthermophilic archaeal DNA poly- 



1 The abbreviations used are: acyNTP, 9-[(2-hydroxyethoxy)methyl]X
triphosphate, where X is adenine, cytosine, guanine, or thymine;
acyCTP, 9-[(2-hydroxyethoxy)methyl)]cytosine triphosphate; dCTPαS,
2'-deoxycytidine 5'-O-(1-thiotriphosphate); ddCTPαS, 2',3'-dideoxycytidine
5'-O-(1-thiotriphosphate); ROX-, 6-carboxy-X-rhodamine; FAM,
6-carboxyfluorescein.
A. [Structural alignment image not reproduced.]

B.
              Region II                Region III

RB69    411 DLTSLYPSII 420      557 INRKLLINSLY 567
Vent    407 DFRSLYPSII 416      487 RAIKLLANSYY 497
9°N     404 DFRSLYPSII 413      484 RAIKILANSFY 495
KOD     404 DFRSLYPSII 413      484 RAIKILANSYY 495
TGO     404 DFRSLYPSII 413      484 RAIKILANSFY 495

Fig. 1. Alignment of Family B DNA polymerase active sites. A,
active site residues in hyperthermophilic Archaea Thermococcus sp.
9°N (green; Protein Data Bank code 1QHT) (22), T. gorgonarius (TGO;
red; code 1TGO) (18), and P. kodakaraensis KOD1 (KOD; blue; code
1GCX) (24) are aligned with the apo-RB69 DNA polymerase (purple;
code 1IH7) based on conserved region III amino acids using
DeepView/SwissPdbViewer Version 3.7 with default settings (available at
www.expasy.org/spdbv/) and rendered with Quanta software (Accelrys
Inc., San Diego, CA). Structural deviations (root mean square
deviations) between backbone atoms along the entire proteins are 1.87,
2.08, and 2.02 Å, respectively, compared with RB69 DNA polymerase.
Conserved active site residues (RB69 DNA polymerase numbering) Lys560,
Asn564, Tyr567, Tyr416, and Asp411 are highlighted. Ala486 is shown in
green on 9°N DNA polymerase; the homologous residue is mutated to
leucine in Vent A488L DNA polymerase. B, Family B active site residues
from conserved regions II and III (4) are aligned.



merases (Vent®, Deep Vent™, 9°N™, and Pfu) all incorporate 
acyNTPs with greater efficiency than ddNTPs (33), in contrast 
with the behavior of Taq and Klenow fragment DNA poly- 
merases, which prefer ddNTPs (33, 44). 

In the course of probing the determinants of nucleotide sugar 
discrimination in the Family B DNA polymerase from the 
hyperthermophilic Archaea Thermococcus litoralis (Vent DNA 
polymerase), we identified a mutant (Vent A488L DNA polymer- 
ase) that reduces discrimination against several altered nucle- 
otides (32, 33). Subsequent crystal structures of closely related 
DNA polymerases strongly suggested that this residue makes 
neither direct nor indirect contacts with the reaction sub- 
strates, raising questions about the structural basis for the 
observed variation (Fig. 1B). The universality of the A488L
phenotype was later confirmed by homologous mutations in 
other hyperthermophilic DNA polymerases (Pfu A486Y DNA 
polymerase (34), 9°N A485L DNA polymerase (33), and Tsp 
JDF-3 A485T DNA polymerase (36)), further emphasizing a 
conserved role for this residue. 

Although instructive, these steady-state observations failed 
to address the underlying kinetic mechanisms responsible for 
nucleotide and nucleotide analog incorporation in hyperther- 
mophilic DNA polymerases. Therefore, we initiated pre-steady- 
state kinetic studies to compare the modes of nucleotide dis- 
crimination in Vent and other DNA polymerases. 



EXPERIMENTAL PROCEDURES 

Nucleotides, Nucleotide Analogs, DNA Substrate, and Enzymes—All
DNA polymerases used in this study are 3'→5' exonuclease-deficient
as a result of mutation of catalytic aspartic and glutamic acids to
alanine in the exonuclease active site (31, 32, 45). These mutations
prevent exonuclease removal of newly incorporated nucleotides or
terminators. Vent and Vent A488L DNA polymerases were purified as
described previously (31), and the concentration was determined
spectroscopically at 280 nm using an extinction coefficient of 115,960 liter
mol^-1 cm^-1. The concentration of E. coli DNA polymerase I (Klenow
fragment exo-; New England Biolabs Inc., Beverly, MA) was calculated
using a specific activity of 20,000 units/mg. dNTPs, ddCTP, and
9-[(2-hydroxyethoxy)methyl)]cytosine triphosphate (acyCTP) were from New
England Biolabs Inc. 2'-Deoxycytidine 5'-O-(1-thiotriphosphate)
(dCTPαS) and CTP were from Amersham Biosciences.
2',3'-Dideoxycytidine 5'-O-(1-thiotriphosphate) (ddCTPαS) was from TriLink
BioTechnologies (San Diego, CA). 6-Carboxy-X-rhodamine (ROX)-derivatized
nucleotide analogs ROX-ddCTP and ROX-acyCTP were kindly provided
by Phil Buzby (PerkinElmer Life Sciences) (Fig. 2). Oligonucleotides
used to measure 2'-deoxycytosine 5'-triphosphate (dCTP) and cytosine
analog incorporation were synthesized and purified by the
Oligonucleotide Synthesis Division at New England Biolabs with a
6-carboxyfluorescein (FAM) label on the primer strand for detection:
5'-FAM-CCCTCGCAGCCGTCCAACCAACTCA-3' (25-mer) and
3'-GGGAGCGTCGGCAGGTTGGTTGAGTGCCTCTTGTTT-5' (36-mer).

FAM-duplex DNA was formed by mixing equimolar amounts of the 
dye-labeled 25-mer primer with the 36-mer template in annealing 
buffer (5 mM Tris-HCl (pH 8.0 at 20 °C), 5 mM NaCl, and 0.2 mM EDTA)
and heating the solutions to 95 °C for 5 min, followed by incubation for 
10 min at 60 °C and then cooling for 15 min at room temperature. 

Burst Kinetics and Active Site Titration—To measure the fraction of
active Vent DNA polymerase and to determine the position of the
rate-limiting step within the polymerase reaction pathway, we
investigated whether the reaction followed burst kinetics. Rapid quench
reactions were carried out as described below with 50 nM FAM-duplex DNA;
10 or 20 nM Vent or Vent A488L DNA polymerase; and 0.20 mM dCTP,
ddCTP, ddCTPαS, CTP, or acyCTP (final concentrations after mixing)
in 1x ThermoPol buffer (10 mM KCl, 20 mM Tris-HCl (pH 8.8 at 25 °C),
10 mM (NH4)2SO4, 2 mM MgSO4, and 0.1% Triton X-100). The
steady-state rate (k_2), the burst amplitude (A, which is equal to the active
site concentration), and the initial rate of product formation (r, the burst
rate) were extrapolated from the burst equation:
[product] = A(1 - e^{-rt}) + k_2 t (45). The steady-state turnover number
(k_ss) was calculated by dividing k_2 by A.
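
The burst analysis above can be illustrated with a short sketch. It assumes the burst equation has the form [product] = A(1 - e^{-rt}) + k_2 t and uses the values quoted in the legend to Fig. 3 for dCTP (A = 21 nM, r = 85 s^-1, k_2 = 18). Treating k_2 as a rate in nM s^-1 is an assumption made here so that k_2/A has units of s^-1; the function names are hypothetical.

```python
import math


def burst_product(t, amplitude, burst_rate, steady_state_rate):
    """Product formed at time t: A(1 - exp(-r t)) + k_2 t."""
    return amplitude * (1.0 - math.exp(-burst_rate * t)) + steady_state_rate * t


def turnover_number(steady_state_rate, amplitude):
    """Steady-state turnover number k_ss = k_2 / A."""
    return steady_state_rate / amplitude


# Values from the Fig. 3 legend (dCTP); the unit of k_2 is assumed to be nM/s.
A_NM = 21.0
R_PER_S = 85.0
K2_NM_PER_S = 18.0

k_ss = turnover_number(K2_NM_PER_S, A_NM)

for t in (0.0, 0.01, 0.05, 0.5, 1.0):
    product = burst_product(t, A_NM, R_PER_S, K2_NM_PER_S)
    print(f"t = {t:5.2f} s  product = {product:6.2f} nM")
print(f"k_ss = {k_ss:.2f} s^-1")
```

With these inputs the exponential term is essentially complete within about 0.05 s, after which product accumulates linearly at k_2. In practice the three parameters would be obtained by nonlinear least-squares fitting rather than entered by hand.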

Measurement of DNA Polymerase Pre-steady-state Kinetic
Parameters—Single turnover nucleotide incorporation reactions were initiated
by mixing Vent or Vent A488L DNA polymerase (0.10 µM) and
FAM-duplex DNA (0.050 µM) in 1x ThermoPol buffer together with an equal
volume of nucleotides or nucleotide analogs in 1x ThermoPol buffer.
The reactions were allowed to proceed for the indicated times and then
quenched by addition of EDTA to a final concentration of 0.35-0.40 M.
Reactions in the range of 3 ms to 10 s were sampled using an RQF-3
rapid quenched-flow instrument (Kintek Corp., Austin, TX). Reactions
with an initial time point > 10 s were mixed and quenched manually. All
Vent DNA polymerase reactions were analyzed at 60 °C. Although this
temperature is lower than the optimal reaction temperature of 72 °C
(31), it is the highest temperature at which the rapid quench
instrument can be operated reliably. Single turnover acyCTP incorporation by
Klenow fragment DNA polymerase was initiated by mixing 1.0 µM
Klenow fragment DNA polymerase (exo-) and 0.10 µM FAM-duplex
DNA substrate in 1x Klenow buffer (50 mM Tris-HCl (pH 7.5) and 2 mM
MgCl2) with an equal volume of acyCTP in 1x Klenow buffer. Reactions
were then incubated at 25 °C for various times and quenched manually
with EDTA (0.1 M final concentration).

Conversion of the fluorescently labeled DNA primer-template to
product was monitored by denaturing PAGE and automated
fluorescence detection methods. Product DNA was denatured by mixing a 7.5
µl aliquot of quenched sample with 45 µl of formamide and 1.5 mM
EDTA and heating at 90 °C for 3 min. Fluorescent 5'-FAM-labeled
25-mer oligonucleotide substrate and 5'-FAM-labeled 26-mer
oligonucleotide product bands were fractionated by electrophoresis on an 8.8 M
urea and 16% polyacrylamide denaturing gel using an ABI377
automated sequencer (Applied Biosystems, Foster City, CA) and quantified
using GeneScan Version 2.1 software (Applied Biosystems). The
first-order rate constant for polymerase-catalyzed addition at each
nucleotide concentration was calculated from a plot of ln[substrate] versus
time. Rate constants (k_obs) were subsequently plotted as a function of
Fig. 2. Nucleotide and nucleotide analogs used for study of Vent DNA polymerase pre-steady-state kinetic reactions. The
[illegible] were determined with the following nucleotides and nucleotide
analogs: [illegible] (A); nucleotide terminators ddCTP and acyCTP (B); and dye-substituted nucleotide terminators ROX-ddCTP
and ROX-acyCTP (C).



nucleotide or analog concentration and fitted to the hyperbolic
equation: k_obs = (k_pol [nucleotide])/(K_D + [nucleotide]), yielding k_pol, the
maximum rate of nucleotide addition, and K_D, the dissociation
constant for nucleotide binding (46). The activation energy difference
between dNTP and nucleotide analog incorporation was calculated by
Equation 1 (47).
\Delta\Delta G^{\ddagger} = -RT \ln\left[ (k_{pol}/K_D)_{dNTP} / (k_{pol}/K_D)_{\text{nucleotide analog}} \right]    (Eq. 1)
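
As a worked illustration of the two relationships just described, the sketch below evaluates the hyperbolic rate law and the activation energy difference. It assumes that Equation 1 has the form ΔΔG‡ = -RT ln[(k_pol/K_D)_dNTP / (k_pol/K_D)_analog], that T is the 60 °C reaction temperature stated above, and it uses the Table II values for dCTP and ddCTP with wild-type Vent DNA polymerase. The function names are hypothetical.

```python
import math

R_J_PER_MOL_K = 8.314      # gas constant
T_KELVIN = 273.15 + 60.0   # reactions were analyzed at 60 °C


def k_obs(conc_um, k_pol, k_d_um):
    """Hyperbolic dependence of the observed rate on nucleotide concentration."""
    return k_pol * conc_um / (k_d_um + conc_um)


def efficiency(k_pol, k_d_um):
    """Catalytic efficiency k_pol/K_D in M^-1 s^-1 (K_D supplied in µM)."""
    return k_pol / (k_d_um * 1e-6)


def delta_delta_g(eff_dntp, eff_analog, temperature=T_KELVIN):
    """Activation energy difference in J/mol, per the assumed form of Eq. 1."""
    return -R_J_PER_MOL_K * temperature * math.log(eff_dntp / eff_analog)


# Table II, wild-type Vent: dCTP (K_D 70 µM, k_pol 66 s^-1),
# ddCTP (K_D 46 µM, k_pol 0.16 s^-1).
eff_dctp = efficiency(66.0, 70.0)
eff_ddctp = efficiency(0.16, 46.0)
ddg = delta_delta_g(eff_dctp, eff_ddctp)

print(f"k_obs at [dCTP] = K_D: {k_obs(70.0, 66.0, 70.0):.1f} s^-1")
print(f"ddG(dCTP vs ddCTP): {ddg / 1000.0:.1f} kJ/mol")
```

Under this sign convention a negative value indicates that the natural dNTP is favored; the magnitude scales with the logarithm of the selectivity ratio.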

Single turnover kinetics require saturating enzyme concentrations. We 
established that 0.10 µM Vent DNA polymerase was sufficient under
the reaction conditions described by demonstrating that the rates of
ddCTP incorporation were the same using Vent DNA polymerase
concentrations of 0.10, 0.20, and 0.40 µM (data not shown).

Measurement of Pyrophosphorolysis Catalyzed by Vent DNA
Polymerase—To measure the rate of DNA degradation by pyrophosphorolysis,
Vent or Vent A488L DNA polymerase (0.10 µM) was equilibrated with
the DNA substrate (0.050 µM) in 1x ThermoPol buffer and then mixed
with PPi in 1x ThermoPol buffer at 60 °C using rapid quench
techniques as described above. The extent of pyrophosphorolysis at each
time point was calculated by multiplying the mole fraction of each DNA
species by the number of phosphodiester bonds hydrolyzed to generate
that species. K_D(PPi) and k_pyro were derived using fitting protocols
analogous to those described above for nucleotide addition.
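
The first-order analysis described in the procedures above (a plot of ln[substrate] versus time) amounts to a linear regression whose negative slope is the observed rate constant. The following is a minimal sketch using synthetic, noise-free data generated inside the example; the time points, the 50 nM starting concentration, and the function name are illustrative assumptions, not measurements from this study.

```python
import math


def first_order_rate(times, substrate):
    """Return k_obs as the negative least-squares slope of ln[substrate] vs time."""
    log_s = [math.log(s) for s in substrate]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log_s) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_s))
    return -sxy / sxx


# Synthetic single-turnover time course (hypothetical values).
TRUE_K = 0.16                      # s^-1
times_s = [0.0, 1.0, 2.0, 4.0, 8.0]
substrate_nm = [50.0 * math.exp(-TRUE_K * t) for t in times_s]

k_est = first_order_rate(times_s, substrate_nm)
print(f"estimated k_obs = {k_est:.3f} s^-1")
```

Repeating this at each nucleotide concentration would give the set of k_obs values that are then fitted to the hyperbolic equation.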

RESULTS 

Analysis of dNTP Incorporation by Vent DNA Polymerase—
Previous studies with Family A DNA polymerases have shown
that the steady-state rate-limiting step for addition of a single
correctly paired dNTP follows phosphodiester bond formation
(8, 13, 46, 48). Consequently, the first round of polymerization
occurs more rapidly than subsequent rounds, resulting in a
rapid initial burst of product. Incorporation of dCTP by Vent
DNA polymerase displayed a burst pattern similar to those
seen with RB69 and AmpliTaq-CS DNA polymerases, with a
rapid burst (k_burst = 60 s^-1) followed by slow steady-state
turnover (k_ss = 0.90 s^-1) (Fig. 3A and Table I). As indicated
above, the burst is diagnostic for a rate-limiting step following
bond formation; moreover, its amplitude is equal to the
concentration of active enzyme, indicating that >90% of the Vent
DNA polymerase preparation was active. Under similar
conditions, Vent DNA polymerase failed to show a significant burst
with ddCTP (Fig. 3B) or CTP (data not shown) incorporation.
These data suggest that the rate-limiting step during
nucleotide analog incorporation has changed compared with dNTP.
Upon substitution of ddCTP with ddCTPαS, Vent and
Vent A488L DNA polymerases showed 10- and 6-fold thio
elemental effects (k_burst(ddCTP)/k_burst(ddCTPαS)), respectively (Table
I), consistent with an altered rate-limiting step.

Determinations of K_D and k_pol for dCTP addition by Vent
DNA polymerase gave kinetic constants similar to those
determined for other DNA polymerases (Fig. 4A and Tables II and
III). The relatively high K_D for nucleotides (K_D = 70 µM) is
similar to the K_m for nucleotides determined in multiple
turnover steady-state measurements (K_m = 40 µM) (31). Kinetic
constants show little dependence on nucleotide identity, as
similar Vent DNA polymerase binding (K_D = 58 µM) and rate
(k_pol = 64 s^-1) constants were observed for dATP incorporation.
Substitution of dCTP with dCTPαS had little effect on binding
(K_D) or phosphodiester bond formation (k_pol); thus, the
polymerase displays a minimal thio elemental effect
(k_pol(dCTP)/k_pol(dCTPαS) = 0.80) (Table II).

Analysis of Vent DNA Polymerase-catalyzed
Pyrophosphorolysis—To examine Vent DNA polymerase pyrophosphorolysis
activity, we monitored degradation of a FAM-labeled
oligonucleotide duplex in the presence of increasing concentrations of
PPi. The dependence of the rate of Vent DNA polymerase
pyrophosphorolysis on PPi concentration yielded an equilibrium
dissociation constant for PPi binding of K_D = 340 µM and
a maximum velocity of k_pyro = 1.1 s^-1 (Table IV).

Analysis of Ribonucleotide and Nucleotide Analog Incorporation
by Vent DNA Polymerase—Kinetic parameters of ribonucleotide
incorporation were determined to analyze the effect of
the presence of a 2'-OH ribonucleotide on polymerization. Vent
DNA polymerase discriminated strongly against CTP incorporation
via a 16-fold reduced binding affinity (K_D = 1100 µM)




[Fig. 3, panels A and B: product formation plotted against time (seconds).]

Fig. 3. Pre-steady-state burst kinetics of dCTP and ddCTP
incorporation by Vent DNA polymerase. Conversion of fluorescently
labeled substrate (25-mer) to product (26-mer) by 20 nM Vent
DNA polymerase with 200 µM dCTP (A) or ddCTP (B) was monitored as
described under "Experimental Procedures." Product (nanomolar)
formation is plotted versus time and fit to the burst equation:
[product] = A(1 - e^{-rt}) + k_2 t. In A, the Vent DNA polymerase dCTP burst
amplitude (A) was 21 nM, the burst rate (r) was equal to 85 s^-1, and the
steady-state rate (k_2) was equal to 18 s^-1. In B, the first-order initial
rate of ddCTP incorporation was 0.5 s^-1.

and a 400-fold slower rate of nucleotide addition (k_pol = 0.160
s^-1) (Table II). Comparison of CTP and dCTP parameters
(expressed as the ratio of catalytic efficiencies:
(k_pol/K_D)dCTP/(k_pol/K_D)CTP) revealed that Vent DNA polymerase
preferred dCTP over CTP by 6000-fold.
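
The selectivity figures quoted here follow from the ratio of catalytic efficiencies, which factors into a binding term and a rate term. The sketch below recomputes these from the unrounded K_D and k_pol values listed for wild-type Vent DNA polymerase in Table II. Because the published selectivities are derived from rounded efficiencies, the recomputed numbers agree only approximately; the function and dictionary names are hypothetical.

```python
# Table II values for wild-type Vent DNA polymerase: (K_D in µM, k_pol in s^-1).
VENT = {
    "dCTP": (70.0, 66.0),
    "CTP": (1100.0, 0.160),
    "ddCTP": (46.0, 0.16),
    "acyCTP": (81.0, 7.6),
}


def catalytic_efficiency(k_d_um, k_pol):
    """k_pol/K_D in M^-1 s^-1, with K_D supplied in µM."""
    return k_pol / (k_d_um * 1e-6)


def discrimination(reference, analog):
    """Return (binding fold, rate fold, overall selectivity) for analog vs reference."""
    ref_kd, ref_kpol = reference
    ana_kd, ana_kpol = analog
    binding_fold = ana_kd / ref_kd        # >1 means weaker analog binding
    rate_fold = ref_kpol / ana_kpol       # >1 means slower analog chemistry
    selectivity = catalytic_efficiency(*reference) / catalytic_efficiency(*analog)
    return binding_fold, rate_fold, selectivity


results = {
    name: discrimination(VENT["dCTP"], VENT[name])
    for name in ("CTP", "ddCTP", "acyCTP")
}

for name, (binding, rate, sel) in results.items():
    print(f"{name:7s} binding x{binding:6.2f}  rate x{rate:7.1f}  selectivity {sel:8.0f}")
```

Splitting the selectivity this way makes the contrast in the text explicit: for CTP both factors contribute, whereas for ddCTP and acyCTP the binding factor is close to 1 and the rate factor dominates.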

In contrast to CTP, discrimination by Vent DNA polymerase
against ddCTP and acyCTP was almost exclusively due to a
slower rate of nucleotide addition, with K_D values for dCTP,
ddCTP, and acyCTP being roughly equal (Fig. 4B and Table II).
Indeed, the approximate 30-fold preference for acyCTP over
ddCTP incorporation can almost entirely be attributed to steps
measured by k_pol.

Similar experiments with Klenow fragment DNA polymerase
showed a 32,000-fold higher discrimination against
acyCTP, affecting steps measured by both K_D and k_pol. The
Klenow fragment DNA polymerase equilibrium binding
constant for acyCTP was increased by 20-fold compared with dCTP
and ddCTP, whereas k_pol for acyCTP incorporation was
reduced by >1500-fold compared with dCTP (Table III).

Table I
Pre-steady-state burst kinetics

The kinetic parameters for Vent and Vent A488L DNA polymerases are from at least two independent determinations and are reported as the
means ± S.D. ND, not determined.

Enzyme           k_burst(dCTP)   k_ss(dCTP)     k_burst(ddCTP)   k_burst(ddCTPαS)
                 s^-1            s^-1           s^-1             s^-1
Vent             60 ± 40         0.90 ± 0.09    0.47 ± 0.05      0.044 ± 0.024
Vent A488L       45 ± 9          0.10 ± 0.01    0.23 ± 0.02      0.039 ± 0.013
RB69^a           230 ± 40        2.7 ± 0.2      ND               ND
AmpliTaq-CS^b    50 ± 7          2.5 ± 0.2      ND               ND

a Ref. 25.
b Ref. 13.

Fig. 4. Vent DNA polymerase pre-steady-state kinetics of
nucleotide and nucleotide analog incorporation. The dependence of
the reaction rate (k_obs) on nucleotide or nucleotide analog concentration
was fit to a hyperbola according to the Michaelis-Menten equation:
k_obs = (k_pol [nucleotide])/(K_D + [nucleotide]), where k_obs is the observed
reaction rate, k_pol is the maximum rate of phosphodiester bond
formation, and K_D is the equilibrium dissociation constant, as described
under "Experimental Procedures." A, a fit of the data for dCTP
incorporation gave K_D(dCTP) = 74 µM and k_pol = 65 s^-1. B, a fit of the data for
ddCTP single turnover gave K_D(ddCTP) = 37 µM and k_pol = 0.13 s^-1.

ROX-ddCTP and ROX-acyNTP Incorporation—Previous
studies found ROX-derivatized ddCTP and acyCTP to be more
efficient terminators than their unmodified forms when using
Vent DNA polymerase (33). Pre-steady-state kinetics revealed
higher binding affinities but slower incorporation kinetics for
the ROX derivatives (Table II), resulting in only marginal
alterations in incorporation selectivity.

Analysis of Enhanced Nucleotide Analog Incorporation by
Vent A488L DNA Polymerase—We previously reported enhanced
incorporation of nucleotide analogs by Vent A488L DNA
polymerase (32, 33). In a burst kinetics experiment, the A488L
mutant enzyme gave an initial burst of dCTP incorporation at
a rate similar to that seen with the wild-type enzyme (k_burst =
45 s^-1; >75% active) (Table I). Moreover, the single turnover
kinetic parameters for dCTP addition (K_D = 77 µM and k_pol =
56 s^-1) were similar to values derived for the wild-type enzyme
(Table II). However, following the initial turnover, the
steady-state rate of the A488L mutant polymerase was 9-fold slower
than that of the wild-type enzyme (k_ss = 0.10 s^-1) (Table I),
accounting for the lower specific activity of Vent A488L DNA
polymerase (32). As with wild-type Vent DNA polymerase,
replacement of dCTP with dCTPαS had little effect on K_D or
k_pol (Table II). Vent A488L DNA polymerase was less active in
pyrophosphorolysis compared with the wild-type enzyme, the
result of a 3-fold reduction in PPi binding affinity and a 2-fold
decrease in k_pyro (Table IV). Incorporation of CTP, ddCTP,
acyCTP, and ROX-ddCTP by Vent A488L DNA polymerase was
more efficient (by 5-10-fold) compared with incorporation by
the wild-type enzyme; in each case, this is attributable to both
increased binding affinity (lower K_D) and faster reaction rates
(k_pol) (Table II). Incorporation of ROX-acyCTP was largely
unaffected by the A488L mutation (Table II).

DISCUSSION 

The fundamental information derived for Vent DNA polym- 
erase incorporation of dCTP confirms and expands earlier 
steady-state data (31) and places Vent DNA polymerase in the 
context of other Family A and B DNA polymerases. As with 
these other polymerases, the steady-state rate for single nucle- 
otide addition is limited by a slow step after phosphodiester 
bond formation. Previous steady-state measurements using an
assay in which all four dNTPs were present gave a value of
10 s^-1.² This value is higher than the steady-state rate derived
here in burst experiments (0.9 s^-1), most likely reflecting the
higher temperature (72 °C) and, more importantly, the
processive synthesis allowed in the earlier studies. In contrast, the
experimental design reported here forces the DNA polymerases
to act in a distributive manner, i.e. dissociating from the DNA
before binding another primer-template and incorporating
another nucleotide.

The single turnover parameters for Vent DNA polymerase 
with the normal dCTP substrate are similar to those of other 
Family A and B polymerases, both mesophilic and thermophilic.
As shown in Table III, K_D and k_pol values differ by



2 H. Kong, H. M. R. B. Kucera, and W. E. Jack, unpublished data. 
Table II

Pre-steady-state kinetic constants for nucleotide and nucleotide analog incorporation by Vent and Vent A488L DNA polymerases
In almost all cases, the kinetic parameters for Vent and Vent A488L DNA polymerases are from at least two independent determinations (except
where indicated by Footnote b) and are reported as the means ± S.D.

                  Vent DNA polymerase                                          Vent A488L DNA polymerase
Nucleotide     K_D           k_pol            k_pol/K_D     Selectivity^a   K_D           k_pol          k_pol/K_D     Selectivity^a
               µM            s^-1             M^-1 s^-1                     µM            s^-1           M^-1 s^-1
dCTP           70 ± 7        66 ± 1           9.5 x 10^5                    77 ± 9        56 ± 3         7.3 x 10^5
dCTPαS         120 ± 40      82 ± 13          6.6 x 10^5    1.4             68 ± 53       28.0 ± 0.3     4.1 x 10^5    1.8
CTP            1100 ± 100    0.160 ± 0.005    1.5 x 10^2    6000            360 ± 30      0.70 ± 0.20    1.6 x 10^3    450
ddCTP          46 ± 7        0.16 ± 0.01      3.5 x 10^3    270             18 ± 6        0.30 ± 0.02    1.8 x 10^4    40
acyCTP         81 ± 25       7.6 ± 1.1        9.7 x 10^4    10              24.0 ± 0.4    13 ± 2         5.4 x 10^5    1.4
ROX-ddCTP      10 ± 2        0.029 ± 0.003    2.9 x 10^3    325             6.0^b         0.2^b          2.5 x 10^4    30
ROX-acyCTP     8.0 ± 0.2     2.00 ± 0.01      2.5 x 10^5    4               6^b           1^b            1.6 x 10^5    4.5

a Selectivity between incorporation of dCTP and other NTPs is the ratio of the efficiency of dCTP incorporation (k_pol/K_D) to the efficiency of CTP,
ddCTP, or acyCTP incorporation.
b The kinetic parameters for Vent A488L DNA polymerase are from single determinations.



Table III 

Pre-steady-state kinetic constants for nucleotide analog incorporation by DNA polymerases 

The kinetic parameters for Vent and Vent A488L DNA polymerases are from at least two independent determinations and are reported as the 
means ± S.D. ND, not determined. 

dCTP 
Enzyme         K_D (μM)           k_pol (s^-1) 
Vent           70 ± 7             66 ± 1 
Vent A488L     77 ± 9             56 ± 3 
RB69           69 ± 16^b          200 ± 13^b 
Klenow         9.6 ± 2.3^c        75 ± 13^c 
KlenTaq        35 ± 2^e           21 ± 4^e 

CTP 
Enzyme         K_D (μM)           k_pol (s^-1)         Selectivity^a 
Vent           1100 ± 100         0.160 ± 0.005        6000 
Vent A488L     360 ± 30           0.70 ± 0.20          450 
RB69           16,000 ± 400^b     0.74 ± 0.2^b         64,000^b 
Klenow         21 ± 7^c           0.047 ± 0.025^c      3400^c 
KlenTaq        ND                 ND                   ND 

ddCTP 
Enzyme         K_D (μM)           k_pol (s^-1)         Selectivity^a 
Vent           46 ± 7             0.16 ± 0.01          270 
Vent A488L     18 ± 6             0.30 ± 0.02          40 
RB69           4300 ± 800^b       0.17 ± 0.02^b        73,000^b 
Klenow         8.4 ± 4^d          0.015 ± 0.004^d      4200^d 
KlenTaq        58 ± 10^e          0.03 ± 0.003^e       1200^e 

acyCTP 
Enzyme         K_D (μM)           k_pol (s^-1)         Selectivity^a 
Vent           81 ± 25            7.6 ± 1.1            10 
Vent A488L     24.0 ± 0.4         13 ± 2               1.4 
RB69           ND                 ND                   ND 
Klenow         200 ± 30           0.048 ± 0.004        32,000 
KlenTaq        ND                 ND                   ND 

a Selectivity between incorporation of dCTP and other NTPs is the ratio of the efficiency of dCTP incorporation (k_pol/K_D) to the efficiency of CTP, 
ddCTP, or acyCTP incorporation. 
b Ref. 30. 
c Ref. 26. 
d Ref. 27. 
e Ref. 13. 

Table IV 
Pyrophosphorolysis 

The kinetic parameters for Vent and Vent A488L DNA polymerases are from at least two independent determinations and are reported as the 
means ± S.D. 

Enzyme         K_D(PPi) (μM)     k_pyro (s^-1)     k_pyro/K_D(PPi) (M^-1 s^-1)    pol:pyro^a 
Vent           340 ± 100         1.10 ± 0.07       3.2 × 10^3                     300 
Vent A488L     1100 ± 400        0.63 ± 0.21       5.7 × 10^2                     1300 
RB69^b         26,000            0.35              1.3 × 10^1                     225,000 
Klenow^c       230               0.31              1.3 × 10^3                     6000 

a A comparison of DNA polymerase activity with pyrophosphorolysis (pol:pyro) is given by the ratio of DNA polymerization efficiency 
(k_pol/K_D(dNTP)) divided by the efficiency of pyrophosphorolysis (k_pyro/K_D(PPi)). 
b Ref. 25. 
c Ref. 9. 



< 10-fold for all polymerases tested, with no clear division be- 
tween Family A and B DNA polymerases. Furthermore, the 
Family A Klenow fragment and Family B Vent and RB69 DNA 
polymerases carry out the reverse reaction of DNA polymerization, 
pyrophosphorolysis, with similar rates (k_pyro), and Klenow 
fragment and Vent DNA polymerases share comparable PPi binding 
constants (K_D(PPi)) (Table IV). Similarities in nucleotide 
incorporation kinetics and active site structure underscore the 
evolution of DNA polymerases to efficiently carry out DNA 
replication and repair. Significant kinetic differences between 
the polymerases become apparent only when examining nucleotide 
analog incorporation and elemental effects, as detailed below. 
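
The pol:pyro comparison reported in Table IV can be recomputed from the tabulated constants. The following is a minimal sketch, assuming K_D values are in micromolar and rates in s^-1; the function and variable names are illustrative and do not come from the paper.

```python
def efficiency(rate_per_s, kd_micromolar):
    """Return rate/K_D in M^-1 s^-1, converting K_D from micromolar to molar."""
    return rate_per_s / (kd_micromolar * 1e-6)


def pol_pyro_ratio(k_pol, kd_dntp, k_pyro, kd_ppi):
    """Polymerization efficiency divided by pyrophosphorolysis efficiency."""
    return efficiency(k_pol, kd_dntp) / efficiency(k_pyro, kd_ppi)


# (k_pol, K_D(dCTP), k_pyro, K_D(PPi)) mean values copied from Tables II and IV.
ENZYMES = {
    "Vent": (66.0, 70.0, 1.10, 340.0),
    "Vent A488L": (56.0, 77.0, 0.63, 1100.0),
}

if __name__ == "__main__":
    for name, constants in ENZYMES.items():
        print(f"{name}: pol:pyro ~ {pol_pyro_ratio(*constants):.0f}")
```

Because the sketch uses unrounded efficiencies, the ratios land close to, but not exactly on, the rounded values of 300 and 1300 given in Table IV.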

Ribonucleotides — Despite a similar level of selectivity against 
NTPs, this discrimination is supplied almost exclusively by 
elements measured by k_pol for Klenow fragment DNA polymerase, 
whereas Vent DNA polymerase shows not only k_pol effects, but also 
a 16-fold weaker ground state binding of the nucleotide. RB69 DNA 
polymerase also shows effects in both K_D and k_pol, achieving an 
even higher discrimination by virtue of a 230-fold weaker ground 
state binding. Discrimination against NTPs has, in large part, been 
attributed to a steric clash between the 2'-OH and a conserved side 
chain in the polymerase active site (26, 30, 32). The kinetic 
parameters suggest that the steric clash is first encountered in 
the ground state nucleotide binding by Vent and RB69 DNA 
polymerases, but does not affect Klenow fragment DNA polymerase 
until the transition state of the reaction. This could occur, for 
example, if the K_D term for Klenow fragment DNA polymerase 
primarily measures binding prior to a conformational shift that 
engages the 2'-OH sensing machinery. 
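
The split between ground state binding and rate effects described above can be made explicit by separating each enzyme's CTP discrimination into a K_D fold-change and a k_pol fold-change. This is an illustrative sketch using the mean values from Table III; the names `DATA` and `discrimination` are assumptions made for the example.

```python
# (K_D in micromolar, k_pol in s^-1) for dCTP and CTP, mean values from Table III.
DATA = {
    "Vent": {"dCTP": (70.0, 66.0), "CTP": (1100.0, 0.160)},
    "RB69": {"dCTP": (69.0, 200.0), "CTP": (16000.0, 0.74)},
    "Klenow": {"dCTP": (9.6, 75.0), "CTP": (21.0, 0.047)},
}


def discrimination(enzyme):
    """Return (binding_fold, rate_fold) for CTP relative to dCTP.

    binding_fold > 1 means weaker ground state binding of CTP;
    rate_fold > 1 means slower incorporation of CTP.
    Their product is the overall selectivity.
    """
    kd_d, kpol_d = DATA[enzyme]["dCTP"]
    kd_r, kpol_r = DATA[enzyme]["CTP"]
    return kd_r / kd_d, kpol_d / kpol_r


if __name__ == "__main__":
    for enzyme in DATA:
        binding, rate = discrimination(enzyme)
        print(f"{enzyme}: K_D fold {binding:.1f}, k_pol fold {rate:.0f}")
```

For Klenow fragment the K_D fold-change stays near 2 while the k_pol fold-change carries almost all of the selectivity, whereas Vent and RB69 show sizable contributions from both terms.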

Dideoxynucleotides — When incorporating ddCTP, RB69 DNA 
polymerase discriminates at the level of both K_D and k_pol. 
Discrimination by Vent, Klenow fragment, and KlenTaq (truncated 
Taq DNA polymerase with a 236-amino acid N-terminal deletion (13)) 
DNA polymerases is almost exclusively in the steps measured by 
k_pol and not those involved in K_D, with Vent DNA polymerase 
showing less discrimination than the other two polymerases. This 
parallel behavior appears to reflect a mutual lack of 3'-OH 
involvement in ground state substrate binding rather than a 
conserved set of nucleotide contacts. 

On the surface, the similarity in k_pol values for dNTP 
incorporation by Vent and Klenow fragment DNA polymerases (10) 
suggests similar discriminatory mechanisms for these two enzymes, 
a conclusion reinforced by the absence of an elemental effect with 
dNTPαS using either enzyme. The simplest interpretation of the 
lack of an elemental effect with α-thio-substituted dNTPs with 
Klenow fragment and Vent DNA polymerases is that a non-chemical 
step(s) preceding phosphodiester bond formation is rate-limiting. 
Similarly, the lack of a significant elemental effect for Klenow 
fragment DNA polymerase incorporation of ddNTPαS (12) argues that 
steps preceding phosphodiester bond formation continue to be 
rate-limiting for that enzyme. In contrast, the elemental effect 
noted for Vent and Vent A488L DNA polymerase incorporation of 
ddNTPαS is an indication that the chemistry of phosphodiester bond 
formation significantly influences the rate-limiting step for 
these polymerases. 

The k_pol rates with both polymerases were significantly slower 
for ddNTPs than for dNTPs: 5000- and 400-fold for Klenow fragment 
and Vent DNA polymerases, respectively. In the case of Vent DNA 
polymerase, this must reflect at least a slowing of the chemical 
rate, whereas for Klenow fragment DNA polymerase, at least the 
rate of pre-chemical step(s) must be slowed. Thus, the 
pre-chemical rate for Vent DNA polymerase ddNTP incorporation is 
at least 10-fold faster than comparable steps for Klenow fragment 
DNA polymerase. 

Conserved amino acids positioned within either Family A or B DNA 
polymerase active sites probe for correctly base-paired substrates 
and concordantly align phosphates into a geometry required for 
phosphoryl transfer. As observed by Franklin et al. (23) and Yang 
et al. (25) in the RB69 DNA polymerase ternary crystal structure 
(and by analogy, in the Vent DNA polymerase active site) (Fig. 1), 
the dNTP deoxyribose moiety assumes a favorable 3'-endo-sugar 
conformation. This conformation is constrained by hydrogen bonds 
between the 3'-OH and a main chain amide (corresponding to Vent 
DNA polymerase position 412) and a non-bridging β-phosphate oxygen 
(Fig. 5, A and B). Nucleotide α-, β-, and γ-phosphates are further 
stabilized by direct or water-mediated hydrogen bonds with active 
site residues (Fig. 5, A and B). The absence of the 3'-OH on 
ddNTPs disrupts hydrogen bonding with the β-phosphate (and main 
chain amide), potentially increasing the activation energy 
required to orient the α-phosphate for phosphoryl transfer (Fig. 
5C). Indeed, the measured energetic difference between dNTP and 
ddNTP incorporation (15 kJ mol^-1) is equivalent to that expected 
for the loss of at least two hydrogen bonds in the ddNTP 
transition state (47). 
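
As a rough check on the quoted energetic difference, the selectivity ratio can be converted to a difference in activation free energy with the standard transition-state relation. The worked line below is a sketch under one explicit assumption: the assay temperature is not stated in this section, so T = 333 K (60 °C) is used purely for illustration, together with the Vent ddCTP selectivity of 270 from Table II.

```latex
\Delta\Delta G^{\ddagger}
  = RT \ln\!\left[\frac{(k_{pol}/K_D)_{\mathrm{dNTP}}}{(k_{pol}/K_D)_{\mathrm{ddNTP}}}\right]
  \approx \left(8.314\times10^{-3}\ \mathrm{kJ\,mol^{-1}\,K^{-1}}\right)(333\ \mathrm{K})\,\ln(270)
  \approx 15.5\ \mathrm{kJ\,mol^{-1}}
```

Under that assumed temperature the result is consistent with the reported value of about 15 kJ mol^-1.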

Although the active site bonding network differs, in the Family A 
Klenow fragment DNA polymerase the dNTP 3'-OH contributes 
21 kJ mol^-1 to transition state stabilization, accounting for 
inefficient ddNTP incorporation (27). This energy loss is 
counteracted in the closely related T7 DNA polymerase active site 
by a hydroxyl group at Tyr 526 (Klenow fragment DNA polymerase has 
Phe in the analogous position) that contributes a hydrogen bond to 
stabilize the ddNTP β-phosphate in the transition state, 
re-establishing a hydrogen bonding network similar to interactions 
formed by dNTP (12). As a result, T7 DNA polymerase selectivity 
between dNTP and ddNTP is greatly reduced, as is the selectivity 
of the analogous Phe-to-Tyr mutation in both Klenow fragment and 
Taq DNA polymerases (49). 




Fig. 5. Active site models of dNTP, ddNTP, and acyNTP 
interactions. A, the RB69 DNA polymerase ternary crystal structure 
shows active site interactions that stabilize the substrate dTTP 
(23). Vent DNA polymerase numbering is shown in parentheses. B, a 
schematic of the Vent DNA polymerase active site interactions that 
stabilize the dNTP transition state is presented in a 
two-dimensional view. W, water molecule. C, a model of the Vent 
DNA polymerase active site with ddNTP bound reveals loss of 
hydrogen bonding with the Tyr 412 main chain amide, dTTP 3'-OH, 
and non-bridging β-phosphate. D, a model for the binding of acyNTP 
suggests that, in the absence of a ribose ring, a molecule X (which 
could be a water molecule) can re-establish hydrogen bonding 
between the Tyr 412 main chain amide and non-bridging β-phosphate. 
Acyclonucleotides — Divergence between polymerases is also noted 
upon acyCTP addition, again suggesting divergent mechanisms for 
nucleotide recognition and incorporation between the polymerases. 
Similar to ddNTPs, acyNTPs lack the 3'-OH required to establish a 
hydrogen bonding network between the main chain amide of Tyr 412 
and the β-phosphate of the substrate (Fig. 5D). Klenow fragment 
DNA polymerase displays a strong discrimination in both K_D and 
k_pol terms, resulting in a selectivity value of 32,000. In this 
case, the efficiency of acyNTP incorporation is nearly as low as 
for an incorrect base pair (k_pol/K_D = 240 and 160 M^-1 s^-1, 
respectively) (Table III) (50). A strong bias against acyNTP 
incorporation has also been noted for Taq DNA polymerase (33). 

In contrast, acyNTPs are incorporated by hyperthermophilic 
archaeal DNA polymerases with only 10-fold lower efficiency than 
dNTPs. By analogy with ddNTP incorporation by T7 DNA polymerase, 
it seems reasonable that the space normally occupied by the sugar 
2'- and 3'-carbons and associated substituents would be accessible 
to water molecules, metals, or protein side chains that might 
establish interactions to compensate for those disrupted by the 
missing 3'-OH group. The difference in activation free energy 
between ddNTP and acyNTP incorporation (ΔΔG‡ = ΔG‡(ddNTP) - 
ΔG‡(acyNTP) = 10 kJ mol^-1) is equivalent to a gain of two 
additional hydrogen bonds, which could be provided by hydrogen 
bonding between the main chain 412 amide, a putative water, and an 
acyNTP β-phosphate non-bridging oxygen to mimic interactions that 
exist in the dNTP active site (Fig. 5D). At the same time, we 
cannot rule out stabilizing interactions arising from residues 
near the active site normally excluded by the ribose 2'- and 
3'-carbons that are absent in acyNTP. Clearly, three-dimensional 
structural analysis will be necessary for a full understanding of 
the interactions important for Vent DNA polymerase incorporation 
of acyNTPs. 

Dye-substituted Nucleotides — Dye-substituted nucleotides have 
been useful in a variety of analytical applications (40, 41). Not 
surprisingly, given the diversity of dye structures and charges, 
dye-substituted nucleotides are accepted by DNA polymerases with 
varying efficiencies (28, 29, 33, 36). Previous studies identified 
nucleotide derivatives containing the fluorescent dye ROX as being 
more efficiently incorporated by Vent DNA polymerase than the 
parental nucleotides lacking the dye (29, 33). In the current 
kinetic studies, the magnitude of enhanced dye-substituted 
terminator addition was much lower than previously estimated in 
semiquantitative gel titration assays, even though those same gel 
titration assays give good agreement with the relative 
incorporation efficiency of ddNTP and acyNTP substrates (33). ROX 
substitution of the nucleotide results in a 5-10-fold lower K_D, 
suggesting that contacts in or adjacent to the Vent DNA polymerase 
active site stabilize dye binding. However, at the same time, 
k_pol is reduced, suggesting that one consequence of the enhanced 
binding is to slow nucleotide addition. Thus, substrate 
incorporation is a balance of both binding and catalysis: a 
substrate bound with too high affinity requires higher activation 
energy for efficient turnover by the polymerase. 

Vent A488L DNA Polymerase Pre-steady-state Kinetics — Previous 
studies identified a Vent DNA polymerase variant (A488L) with 
enhanced nucleotide analog incorporation properties (32, 33). 
Correct dNTP incorporation by Vent A488L DNA polymerase is 
characterized by similar binding affinity (K_D), nucleotide 
transfer rate (k_pol), and rate-limiting step compared with Vent 
DNA polymerase, presumably reflecting the conservation of residues 
actively involved in coordinating the incoming dNTP. In contrast, 
each of the nucleotide analogs tested with Vent A488L DNA 
polymerase has higher binding affinity and a faster rate of 
phosphoryl transfer than with the unmodified polymerase. Energy 
differences between Vent and Vent A488L DNA polymerase 
incorporation of ddNTP or acyNTP are modest (ΔΔG‡(ddNTP) = 
4.5 kJ mol^-1 and ΔΔG‡(acyNTP) = 4.8 kJ mol^-1), suggesting that 
subtle hydrophobic or hydrogen bond-mediated effects could account 
for enhanced analog incorporation. 
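
The modest energy differences between the two enzymes can be approximated from the Table II efficiencies. The sketch below assumes an assay temperature of 60 °C (333.15 K), which is not stated in this section and is chosen here only for illustration; the helper name `ddg_kj` is likewise hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def ddg_kj(eff_a, eff_b, temp_k=333.15):
    """Difference in activation free energy (kJ/mol) implied by two efficiencies.

    Computes R*T*ln(eff_a / eff_b). temp_k is an ASSUMED temperature,
    not a value taken from the text.
    """
    return R_KJ * temp_k * math.log(eff_a / eff_b)


# k_pol/K_D in M^-1 s^-1 from Table II, ordered (Vent A488L, Vent).
EFFICIENCY = {
    "ddCTP": (1.8e4, 3.5e3),
    "acyCTP": (5.4e5, 9.7e4),
}

if __name__ == "__main__":
    for analog, (variant, wild_type) in EFFICIENCY.items():
        print(f"{analog}: {ddg_kj(variant, wild_type):.1f} kJ/mol")
```

With the assumed temperature, both values fall within a few tenths of a kJ mol^-1 of the 4.5 and 4.8 kJ mol^-1 quoted above; a different assay temperature would shift them proportionally.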

One hypothesis to account for these effects envisions the 
A488L variant as lying closer to the activated conformation, 
thus facilitating incorporation of analogs. The residue analogous 
to Ala 488 in the RB69 DNA polymerase crystal structure points 
away from the active site and lies at the interface between the 
solid core of the polymerase and an α-helix that must make a 60° 
rotation to form the closed complex (Fig. 1). In Vent DNA 
polymerase, positioning a larger leucine residue at the position 
normally occupied by alanine in the α-helix may shift equilibrium 
from the open toward the closed conformation, thus reducing the 
activation energy for both binding and nucleotide transfer. This 
comes at a price: burst experiments 
demonstrate that subsequent turnover by the A488L variant is 
inhibited, perhaps reflecting hindrance of the transition from 
closed to open states required for release and/or binding of the 
template and dNTP. This proposal does not, however, easily 
account for the fact that pre-steady-state kinetics for the nat- 
ural substrates are unaltered in this variant. Alternatively, 
resolution of this discrepancy may lie in the greater ability of 
this variant to overcome distortions in the nucleotide-binding 
site, distortions that are not present when the normal nucleo- 
tide is bound. 

In summary, from these comparative studies, we observed 
that kinetics of dNTP incorporation pathways are conserved 
among Family A and B DNA polymerases despite diversity in 
primary amino acid sequence, thermostability, fidelity, and 
biological roles. However, differences in acyNTP and other 
nucleotide analog catalytic efficiencies in Klenow fragment, 
Vent, and other DNA polymerases illuminate fundamental dif- 
ferences underlying the kinetic pathway for DNA polymeriza- 
tion. As more DNA polymerases are studied kinetically, it is 
apparent that subtle structural variations in the active site 
influence how nucleotides are bound and positioned for 
catalysis. 
The crystal structure of family B DNA polymerase from the 
hyperthermophilic archaeon Pyrococcus kodakaraensis KOD1 (KOD DNA 
polymerase) was determined. KOD DNA polymerase exhibits the highest 
known extension rate, processivity and fidelity. We carried out the 
structural analysis of KOD DNA polymerase in order to clarify the 
mechanisms of those enzymatic features. Structural comparison of 
DNA polymerases from hyperthermophilic archaea highlighted the 
conformational difference in Thumb domains. The Thumb domain of KOD 
DNA polymerase shows an "opened" conformation. The fingers 
subdomain possessed many basic residues at the side of the 
polymerase active site. The residues are considered to be 
accessible to the incoming dNTP by electrostatic interaction. A 
β-hairpin motif (residues 242-249) extends from the Exonuclease 
(Exo) domain as seen in the editing complex of the RB69 DNA 
polymerase from bacteriophage RB69. Many arginine residues are 
located at the forked-point (the junction of the template-binding 
and editing clefts) of KOD DNA polymerase, suggesting that the 
basic environment is suitable for partitioning of the primer and 
template DNA duplex and for stabilizing the partially melted DNA 
structure in the high-temperature environments. The stabilization 
of the melted DNA structure at the forked-point may be correlated 
with the high PCR performance of KOD DNA polymerase, which is due 
to low error rate, high elongation rate and processivity. 

© 2001 Academic Press 

Keywords: archaea; crystal structure; family B DNA polymerase; "forked- 
point"; KOD DNA polymerase 



Introduction 

DNA polymerases are a group of enzymes that use single-stranded 
DNA as a template for the synthesis of the complementary DNA 
strand. These enzymes are multifunctional, with both synthetic 
(polymerase) and one or two degradative modes (5'-3' and/or 3'-5' 
exonucleases), and play an essential role in nucleic acid 
metabolism including the processes of DNA replication, repair and 
recombination. Many DNA polymerase genes have been cloned and 
sequenced. Amino acid sequences deduced from their nucleotide 
sequences can be classified into four major types: Escherichia 
coli DNA polymerase I (family A), E. coli DNA polymerase II 
(family B), E. coli DNA polymerase III (family C) and others 
(family X).¹ Recently, a new family of DNA polymerases has been 
identified; all members of this family contain five highly 
conserved motifs, I-V, and several of these polymerases 
participate in lesion bypass.² This family is called the 
UmuC/DinB family.³ Family B DNA polymerases include eukaryotic DNA 
polymerases α, δ, and ε, which are thought to be components of the 
replisome and to carry out chromosomal DNA replication. Archaeal 
proteins involved in gene expression, such as those for DNA 
replication, transcription, and translation, have been found to be 
similar to those from eucarya. Therefore, the archaeal system of 
gene expression is a simplified model of the eukaryotic system. In 
contrast, the cellular appearance and organization of archaea are 
more like those of bacteria. 

The first crystal structure of a family B DNA polymerase to be 
obtained was that of bacteriophage RB69 DNA polymerase (RB69 DNA 
polymerase).⁴ The first crystal structure of an archaeal DNA 
polymerase was that of DNA polymerase from Thermococcus 
gorgonarius (Tgo DNA polymerase).⁵ The editing complex of RB69 DNA 
polymerase has been reported,⁶ and two further crystal structures 
of archaeal family B DNA polymerases have recently been reported: 
Tok DNA polymerase from Desulfurococcus sp. Tok⁷ and 9°N-7 DNA 
polymerase from Thermococcus sp. 9°N-7.⁸ 

Pyrococcus kodakaraensis KOD1 is a hyperthermophilic archaeon, 
with an optimum growth temperature of 95 °C.⁹ Enzymes produced in 
KOD1 were reported to be extremely thermostable and to have 
eukaryotic characteristics.⁹ The optimum temperature of KOD DNA 
polymerase is 75 °C, similar to that of DNA polymerase obtained 
from Pyrococcus furiosus (Pfu DNA polymerase). KOD DNA polymerase, 
however, exhibits a higher extension rate (100-130 
nucleotides/second) and processivity (>300 bases), which are five 
times and ten to 15 times higher than those of Pfu DNA polymerase, 
respectively.¹⁰ Thermostable DNA polymerases are expected to be 
suitable enzymes for the Polymerase Chain Reaction (PCR). KOD DNA 
polymerase is, therefore, suitable for DNA amplification by such 
means. Indeed, KOD DNA polymerase is widely used in rapid and 
accurate PCR systems (TOYOBO Ltd., Japan). 

Although structures of three archaeal DNA polymerases have been 
determined as described above, they provide no structural 
information relating to elongation rate, processivity or fidelity. 
We carried out the structural analysis of KOD DNA polymerase in 
order to clarify the mechanisms behind its distinguishing 
enzymatic features: the highest known extension rate, processivity 
and fidelity. Here, we report the crystal structure 
of DNA polymerase from the hyperthermophilic 
archaeon Pyrococcus kodakaraensis KOD1. The three- 
dimensional structure of this KOD DNA polymer- 
ase may provide useful information to clarify the 
mechanisms for rapid and accurate reaction. In 
addition, this information may contribute to the 
improvement of the PCR properties of enzymes 
already in use such as thermostability, error rate, 
elongation rate and processivity, or for designing 
new enzymes for PCR as well as DNA replication 
by family B DNA polymerases. 

Results and Discussion 
Overall structure 

KOD DNA polymerase has a disk-like shape with dimensions 
60 Å x 80 Å x 100 Å and is made up of distinct domains and 
subdomains: N-terminal (N-ter: 1-130, 327-368, violet), 
Exonuclease (Exo: 131-326, blue), Polymerase (Pol) domain 
including the Palm and Fingers subdomains (369-449, 500-587, 
brown; and 450-499, green, respectively) and the Thumb domain 
including thumb-1 and thumb-2 subdomains (588-774, red) 
(Figure 1(a)). The polymerase active site, containing three 
conserved carboxylates (Asp404, Asp540 and Asp542), is located in 
an anti-parallel β-sheet in the Palm subdomain. The exonuclease 
active site contains two conserved carboxylates (Asp141 and 
Glu143) and is located in an anti-parallel β-sheet in the Exo 
domain. The polymerase and exonuclease active sites on the 
molecular surface are indicated by P and E, respectively (see 
Figure 4). Structural comparisons of archaeal DNA polymerases 
(KOD, Tgo and 9°N-7 DNA polymerases) are shown in Figure 1(b). The 
structural architectures of the proteins are identical, but the 
orientation of the domains and subdomains is different. In the 
case of the KOD DNA polymerase (red), the Thumb domain is shifted 
to make an "open" conformation and the portion of the Palm domain 
neighboring the root of the Thumb domain is slightly shifted as a 
result of the large movement of the Thumb domain in comparison to 
other archaeal DNA polymerases. Table 1 shows the averaged 
temperature factors of the domains and subdomains in the crystal 
structure of KOD DNA polymerase. The value of the Thumb domain was 
markedly higher than the others. The structures of many residues 
in the Thumb-2 subdomain are not defined, because the orientation 
of the subdomain is highly disordered. Therefore, it is thought 
that the structure of KOD DNA polymerase described here provides 
information for the DNA-free, most relaxed conformation. The 
structure of the editing complex of RB69 DNA polymerase revealed 
that newly synthesized duplex DNA is grasped by the Pol and Thumb 
domains. Although the orientation of the Thumb domain is 
potentially highly flexible, the orientation may be fixed when it 
binds to the primer-template duplex. 
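
The domain boundaries listed above lend themselves to a small lookup that assigns any residue number to its domain or subdomain. This is an illustrative sketch only: the Fingers range is read as 450-499 (the gap left between the two Palm segments), and the names `DOMAINS` and `domain_of` are invented for the example.

```python
# Inclusive residue ranges for KOD DNA polymerase, as listed in the text.
# The Fingers range is read as 450-499, the gap between the two Palm segments.
DOMAINS = {
    "N-ter": [(1, 130), (327, 368)],
    "Exo": [(131, 326)],
    "Palm": [(369, 449), (500, 587)],
    "Fingers": [(450, 499)],
    "Thumb": [(588, 774)],
}


def domain_of(residue):
    """Return the domain/subdomain name containing a residue number, or None."""
    for name, ranges in DOMAINS.items():
        if any(low <= residue <= high for low, high in ranges):
            return name
    return None


if __name__ == "__main__":
    # Catalytic carboxylates named in the text.
    for label, number in [("Asp404", 404), ("Asp542", 542), ("Asp141", 141)]:
        print(label, "->", domain_of(number))
```

The lookup places the polymerase carboxylates in the Palm subdomain and the exonuclease carboxylates in the Exo domain, matching the description in the text.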

Polymerase domain 

The Pol domain is made up of the Fingers and 
Palm subdomains and has an "Hike" shape 
(Figure 2(a)). The polymerization mechanism has 
been studied mainly on family A DNA poly- 
merases (Pol-I). A structural basis for a metal- 



Table 1. Averaged temperature factors 

Domain          Temperature factor (Å^2) 
N-ter           38.1 
Exo             55.7 
Pol 
  Fingers       49.5 
  Palm          52.8 
Thumb           93.7 
Overall         55.9 



Crystal Structure of KOD DNA Polymerase 











Figure 1. (a) Overall structure of KOD DNA polymerase. The 
structure is composed of domains and subdomains, which are 
N-terminal (N-ter, violet), Exonuclease (Exo, blue), Polymerase 
(Pol) domain including the Palm (brown) and Fingers (green) 
subdomains and the Thumb domain (red), including the Thumb-1 and 
Thumb-2 subdomains. Conserved carboxylate residues in the 
Polymerase and Exonuclease active sites are shown by 
ball-and-stick models. (b) Conformational comparison of Thumb 
domains among three archaeal DNA polymerases. Red, KOD DNA 
polymerase; blue, Tgo DNA polymerase; and green, 9°N-7 DNA 
polymerase. The comparison shows that the Thumb domain of KOD DNA 
polymerase displays the most "opened" conformation. 



assisted mechanism of phosphoryl transfer was provided by the 
bacteriophage T7 DNA replication complex.¹¹ The complex structure 
shows that two metal ions are bound by strictly conserved 
carboxylates (Asp475 and Asp654, which correspond to Asp404 and 
Asp542 in KOD DNA polymerase) extended from the anti-parallel 
β-sheet of the Palm domain. The phosphate group of incoming ddGTP 
is held by the metal ions and the four basic residues extending 
from the Fingers subdomain (His506, Arg518 and Lys522). The 
crystal structures of two ternary complexes of the large fragment 
of 
Figure 2. (a) Ribbon representation of the Pol domain. The domain 
is made up of Fingers and Palm subdomains. Conserved carboxylate 
residues (D404, D540 and D542) are represented by ball-and-stick 
models. Basic residues, which stand in a line in the Fingers 
subdomain facing the polymerase active site, are represented by 
ball-and-stick models. K464 is replaced by alanine, because of the 
ambiguity of its electron density. Two disulfide bonds are 
displayed (C428-C442 and C506-C509). Aromatic residues adjacent to 
a glycine residue, represented by ball-and-stick models, are 
localized in the joints of the subdomains. The Thumb domain is 
represented by the semitransparent model. The Cα atoms of the 
nucleophilic residues, S407 (Pko Pol-1), S492 (Tli Pol-1) and T541 
(Tli Pol-2), are represented by violet spheres. (b) Exonuclease 
domains of KOD DNA polymerase and RB69 DNA polymerase 
(semitransparent model). Conserved carboxylate (D141 and E143) and 
arginine residues (R243, R247, R265, R266, R343, R381 and R501) in 
the forked-point of KOD DNA polymerase are represented by 
ball-and-stick models. The red strands are the β-hairpin motif 
partitioning the template-binding and editing clefts. The loop 
containing Phe152 is shown in orange. F123 of RB69 DNA polymerase 
and F152 of KOD DNA polymerase are represented by semitransparent 
and opaque ball-and-stick models, respectively. 
Thermus aquaticus DNA polymerase I (Klentaq1) with a 
primer-template DNA and ddCTP have been reported.¹² The ternary 
complexes suggest that basic residues of the Fingers subdomain 
hold the phosphate group of the incoming dNTP and the domain 
induces a conformational change to deliver the incoming nucleotide 
to the active site. In the case of family B DNA polymerases, the 
Fingers subdomain is composed mainly of two long helices and does 
not have the joint that appears in the structures of family A DNA 
polymerases. Therefore, it seems that in the case of archaeal DNA 
polymerases, the movement of the Pol domain to deliver dNTP to the 
active site differs from that of family A DNA polymerases. A 
kinetic study of RB69 DNA polymerase mutants revealed that four 
residues (Arg482, Lys486, Lys560 and Asn564) of the Fingers 
subdomain affected dNTP incorporation.¹³ The residues are 
conserved in family B DNA polymerases, and correspond to Arg460, 
Lys464, Lys487 and Asn491 in KOD DNA polymerase, respectively. 
Furthermore, Lys468, Arg476, Lys477 and Arg484 are located at the 
tip of the Fingers subdomain on the side of the polymerase active 
site in KOD DNA polymerase (Figure 2(a)). It is expected that the 
"queue" of basic residues captures the incoming dNTP, and that the 
dNTP is then delivered toward the polymerase active-site center by 
the accompanying movement of the polymerase domain. Two disulfide 
bonds exist in the connection site between the Palm and Fingers 
subdomains (Figure 2(a); Cys428-Cys442 and Cys506-Cys509). The two 
disulfide bonds are found also in the crystal structures of Tgo, 
Tok and 9°N-7 DNA polymerases. Sequence alignment for archaeal DNA 
polymerases is shown in Figure 3, suggesting the potential for the 
formation of disulfide bonds in the same sites. It is thought that 
the disulfide bonds are required to maintain the structure of the 
Fingers and Palm subdomains at extremely high temperatures. 
Sequence comparison suggests that the number of disulfide bonds is 
correlated with the optimum growth temperatures of the organisms. 
DNA polymerases from Thermococcus litoralis, Methanococcus 
jannaschii and Archaeoglobus fulgidus, with optimum growth 
temperatures of 85, 85 and 83 °C, respectively, are expected to 
have one disulfide bond, because Cys506 is replaced by serine in 
T. litoralis and M. jannaschii, and Cys442 is replaced by arginine 
in A. fulgidus. DNA polymerase from Methanobacterium 
thermoautotrophicum, with an optimum growth temperature of 65 °C, 
is expected to have no disulfide bond, because Cys428, Cys442 and 
Cys506 are replaced by glutamic acid, arginine and serine, 
respectively. 
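
The disulfide predictions above follow from a simple rule: a bond is expected only when both partner positions are cysteine. The sketch below encodes the substitutions named in the text; positions that the text does not mention are assumed here to remain cysteine, which is an assumption of the example rather than a statement from the paper, and all names in the code are illustrative.

```python
# Disulfide partners in KOD DNA polymerase numbering, as given in the text.
PAIRS = [(428, 442), (506, 509)]

# One-letter residue at each position. Substitutions are those named in the
# text; positions not mentioned there are ASSUMED to remain cysteine.
SITES = {
    "P. kodakaraensis": {428: "C", 442: "C", 506: "C", 509: "C"},
    "T. litoralis": {428: "C", 442: "C", 506: "S", 509: "C"},
    "M. jannaschii": {428: "C", 442: "C", 506: "S", 509: "C"},
    "A. fulgidus": {428: "C", 442: "R", 506: "C", 509: "C"},
    "M. thermoautotrophicum": {428: "E", 442: "R", 506: "S", 509: "C"},
}


def predicted_disulfides(sites):
    """Count partner pairs in which both positions are cysteine."""
    return sum(1 for a, b in PAIRS if sites[a] == "C" and sites[b] == "C")


if __name__ == "__main__":
    for organism, sites in SITES.items():
        print(f"{organism}: {predicted_disulfides(sites)} predicted disulfide bond(s)")
```

The counts reproduce the expectation stated in the text: two bonds for KOD, one each for the three organisms growing at 83 to 85 °C, and none for M. thermoautotrophicum.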

Archaeal DNA polymerases have characteristic 
sequences of aromatic residues adjacent to glycine 
residues (Figure 3). These are localized at the 
hinges of the Palm subdomain at the connections 
to the Fingers and Thumb-1 subdomains 
(Figure 2(a)). These aromatic residues may provide 
a flexible aromatic environment because of the 
adjoining glycine residues. This may contribute to the 
conformational changes of the Pol domain in polymerization. 

The 3'-5' exonuclease domain 

DNA synthesis reflects a competition between the rates of the 
polymerase and exonuclease activities at the newly synthesized 3' 
terminus of the primer. Misincorporation of a nucleotide 
destabilizes the structure of duplex DNA at the 3' terminus of the 
primer. This decreases the rate of nucleophilic attack on the 
α-phosphate group of the incoming dNTP by the primer 3'-OH and 
allows excision of the incorrect nucleotide by the proofreading 
exonuclease. The excision requires the movement of the 3' terminus 
to the exonuclease active site accompanied by rewinding of the 
duplex DNA, because the exonuclease active site is set apart from 
the polymerase active site. In KOD DNA polymerase, the exonuclease 
active site is set apart from the polymerase active site by 
approximately 40 Å. The editing complex of RB69 DNA polymerase 
shows structural similarity to the editing mode of family B DNA 
polymerase.⁶ The DNA polymerase binds the mismatched 
primer-template DNA, which is partially denatured; the 3' end of 
the primer strand is bound at the exonuclease site. Residues 
251-262 of RB69 DNA polymerase, which form an extended β-hairpin 
structure that juts directly out from the protein surface and 
projects into the DNA, stabilize the partially denatured or melted 
structure. Arg260, extending from the β-hairpin motif, plays an 
important role. Arg260 and Phe123 appear to block the template 
strand by making interactions with the penultimate base at the 3' 
end of the primer-template. Arg260 and Phe123 in RB69 DNA 
polymerase correspond to Arg247 and Phe152 in KOD DNA polymerase, 
respectively. Figure 2(b) shows the structural comparison of the 
Exo domains of KOD and RB69 DNA polymerases. Molecular surface and 
electrostatic potentials are shown in Figure 4. The β-hairpin 
motif in KOD DNA polymerase corresponds to residues 242-249, with 
Arg247 extending to the forked-point, which is the junction of the 
template-binding and editing clefts (T-cleft and E-cleft, 
respectively) (Figure 4). It seems that Arg247 can separate the 
template strand from the primer strand and stabilize the melted 
structures of the strands in a manner similar to that of the RB69 
DNA polymerase. As Phe152 is set apart from the active site, it is 
apparently unable to make an aromatic interaction with the base of 
the primer. Based on the above idea, movement of the loop 
including Phe152 (Figure 2(b)) is required for it to interact with 
the primer bound at the E-cleft. Furthermore, Arg243 extends from 
the β-hairpin structure to the T-cleft. Arg243 interacts with the 
template strand to fix it at the T-cleft. In addition to Arg243 
and Arg247, five arginine residues gather at the forked-point in 
KOD DNA polymerase (Arg265, Arg266, Arg346, Arg381 and Arg501) and 
provide a basic environment (Figures 2(b) and 4). 
It seems that they can interact with the phosphate 






Crystal Structure of KOD DNA Polymerase 



[Figure 3 alignment panel: rows Pko, Pfu, Tli, Mja, Afu and Mth; residue numbering from 140 to 540; Exo and Pol region labels; intein sites Pko Pol-1 and Tli Pol-2 marked. The residue-level alignment is not recoverable.]

Figure 3. Sequence alignment of archaeal DNA polymerases. The abbreviations used are as follows: Pko, Pyrococcus
kodakaraensis; Pfu, Pyrococcus furiosus; Tli, Thermococcus litoralis; Mja, Methanococcus jannaschii; Afu, Archaeoglobus
fulgidus; and Mth, Methanobacterium thermoautotrophicum. Homologous residues are masked in gray. Remarkable residues
are highlighted in reverse type. Conserved carboxylate residues in the Exonuclease and Polymerase active sites are
shown in red. Basic residues gathering in the forked-point and Fingers subdomain are shown in blue. R243, R247,
R265, R266, R346, R381 and R501 are located in the forked-point. R460, K464, K468, R476, K477, R484 and K487 are
located in the Fingers subdomain and face into the polymerase active site. Cysteine residues forming (or possibly
forming) disulfide bonds are shown in green. Nucleophilic residues in the self-splicing reaction are shown in violet.
Inteins intervene before the nucleophilic residues. Aromatic residues adjacent to glycines are shown in orange.



groups of the DNA strand and stabilize the melted 
structure of DNA strands at the forked-point. 
Several arginine residues at the forked-point are 
conserved in known family B DNA polymerases 
from hyperthermophilic archaea. 

In DNA synthesis, the structure of DNA is variable
at the stage of switching between the
elongation and editing modes. Hyperthermophiles
must have mechanisms to protect their genomic
DNA against thermal denaturation. The genomic
DNA of hyperthermophilic archaea has nucleosome-like
structures brought about by interaction
with histone-like proteins. 14 Nevertheless, at the
replication fork, the DNA strands are exposed.
Therefore, DNA polymerases of hyperthermophilic
archaea are required to stabilize the exposed or
melted DNA structure in the high temperature
environment. The stabilization by DNA polymerase
may correlate with the enzymatic characteristics of
DNA polymerase such as half-life of
activity, error rate, elongation rate and processivity.
As discussed above, it is considered that
the arginine residues around the "forked-point"
have a remarkable effect on the stability of DNA
structure. In the forked-point of Pfu DNA polymerase,
Arg247, Arg265 and Arg501 are replaced by
methionine, threonine and lysine, respectively.
Therefore, the replacements may contribute to the
differences in enzymatic characteristics between
KOD and Pfu DNA polymerases. Additional experiments
such as site-directed mutagenesis,
together with enzymatic studies of DNA
polymerases, are necessary to clarify the role of the
residues at the forked-point.
Figure 4. Molecular surface with electrostatic potential
map around the forked-point. The red and blue surfaces
are acidic and basic regions, respectively. Domains and
subdomains are labeled with orange letters. Polymerase
and Exonuclease active sites are labeled with P and E,
respectively. The β-hairpin is labeled with β.



Extein connection site 

The KOD DNA polymerase gene encodes a precursor
protein of 1671 amino acid residues. The precursor
protein is processed precisely into three parts
by protein splicing. The self-splicing reaction yields
the mature KOD polymerase (774 residues) and
two intervening protein domains (termed inteins),
PI-PkoI (360 residues) and PI-PkoII (537 residues),
as a result of the ligation of the external N and
C-terminal domains (termed exteins). 10,15 All known
precursor proteins contain conserved amino acids
at self-splicing sites: serine, threonine or cysteine
(nucleophiles) at the intein N terminus, and a
His-Asn pair at the intein C terminus followed by
serine, threonine or cysteine (nucleophiles) at the
C-extein N terminus. 16 The traces of the protein
splicing reaction in KOD DNA polymerase are
Ser407 and Ser492, which are located at the N
terminus of the C-extein. In the crystal structure of
KOD DNA polymerase, the nucleophilic residues
are found in the Pol domain (Figure 2(a)).
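The residue counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below only sums the numbers given in the text; the intein names are written here as PI-PkoI and PI-PkoII, and the function name is a hypothetical helper, not part of any published tool.

```python
# Lengths (in residues) quoted in the text for the KOD precursor
# and its splicing products.
PRECURSOR_LENGTH = 1671
MATURE_POLYMERASE = 774
INTEINS = {"PI-PkoI": 360, "PI-PkoII": 537}


def spliced_total(mature, inteins):
    """Sum of the mature (ligated extein) product and all excised inteins."""
    return mature + sum(inteins.values())


total = spliced_total(MATURE_POLYMERASE, INTEINS)
print(total, total == PRECURSOR_LENGTH)
```

The three products account for every residue of the precursor, which is what precise splicing without loss of residues would require.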

Self-splicing sites in archaeal family B DNA
polymerases (α family) are classified into three
types: Pko Pol-1, Tli Pol-1 and Tli Pol-2 (The
Intein Database, http://www.neb.com/neb/inteins.html).
The nucleophilic residues, serine or
threonine, in the three sites are mapped in
Figure 2(a). In the case of KOD DNA polymerase,
PI-PkoI intervenes in the Pko Pol-1 site and PI-PkoII
intervenes in the Tli Pol-1 site. The structure shows
that they are localized around the polymerase
active site in the Palm domain. Although they are
exposed to solvent, they are surrounded by the
Fingers subdomain and the Thumb domain. The
two inteins cannot exist in the space because of
steric hindrance. Therefore, it is necessary that the
folding of inteins and the subsequent self-excisions
are carried out before the extein is folded.

Materials and Methods 

Crystallization 

KOD DNA polymerase was overexpressed in E. coli
BL21(DE3) and purified by the previously reported
method. 10 The crystals of KOD DNA polymerase were
grown by the previously reported method. 17 KOD DNA
polymerase was concentrated up to an absorbance at
280 nm (A280) of about 25. Crystals of KOD DNA polymerase
suitable for diffraction experiments were obtained at 293 K with
hanging drops of 2 µl of protein solution and 2 µl of reservoir
solution containing 100 mM sodium citrate buffer
(pH 5.5) and 25-30% (v/v) 2-methyl-2,4-pentanediol
(MPD), equilibrated against the reservoir solution.

Data collection 

X-ray diffraction measurements were performed at
beamline 18B of the Photon Factory at the High Energy
Accelerator Research Organization, Tsukuba Science
City, Japan. Each crystal of KOD DNA polymerase was
picked up directly with a nylon fiber loop from a drop
of mother liquid; the crystal was then rapidly transferred
to the N2 gas stream. The incident beam with a wavelength
of 1.00 Å was collimated to 0.2 mm in diameter.
Intensity data were collected on 200 mm × 400 mm imaging
plates (Fuji Film Company Ltd.) using the Weissenberg
camera for macromolecules with a radius of
430 mm 18,19 and the oscillation method with 3° rotation
per frame. The crystals diffracted at least to 2.8 Å resolution
at 100 K. X-ray diffraction data were processed
and scaled with programs DENZO and SCALEPACK. 20
The diffraction data were scaled with zero σ cutoff.
Unit-cell parameters were determined as a = 111.9 Å,
b = 112.4 Å and c = 73.9 Å with the space group
P2₁2₁2₁. The unit-cell parameters gave a Matthews coefficient
of 2.60 Å³ Da⁻¹ and a solvent content of 52.2%
(v/v). 21 The final data set consisted of
119,205 measurements of 20,298 unique observed reflections
with an overall Rmerge of 8.4% and 34.5% in the
outermost resolution shell (2.90-2.80 Å). This represents
88.1% of theoretically observable reflections at 2.8 Å
resolution. The outermost resolution shell of data is
83.7% complete.
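The crystal-packing figures quoted above follow from simple arithmetic on the unit cell. The sketch below is an illustration under stated assumptions, not the authors' calculation: it assumes one polymerase molecule per asymmetric unit (four per orthorhombic P2₁2₁2₁ cell), a hypothetical molecular mass of 90 kDa for the 774-residue enzyme (the text does not give the mass), and the common approximation solvent fraction = 1 - 1.23/V_M.

```python
# Unit-cell edges in angstroms, as quoted in the text.
A, B, C = 111.9, 112.4, 73.9
Z = 4                       # assumption: 4 molecules per P2(1)2(1)2(1) cell
ASSUMED_MASS_DA = 90_000.0  # assumption: approximate mass, not from the text


def orthorhombic_cell_volume(a, b, c):
    """Volume of an orthorhombic cell (all angles 90 degrees), in cubic angstroms."""
    return a * b * c


def matthews_coefficient(volume, z, mass_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return volume / (z * mass_da)


def solvent_fraction(vm, protein_term=1.23):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - protein_term / vm


volume = orthorhombic_cell_volume(A, B, C)
vm = matthews_coefficient(volume, Z, ASSUMED_MASS_DA)
solvent = solvent_fraction(vm)

# Average number of measurements per unique reflection.
multiplicity = 119_205 / 20_298

print(f"V_M = {vm:.2f} A^3/Da, solvent = {solvent:.1%}, multiplicity = {multiplicity:.2f}")
```

With these assumptions the result lands close to the reported 2.60 Å³ Da⁻¹ and 52.2%; the small differences depend on the exact molecular mass and protein density constant used.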

Structure determination 

The crystal structure of KOD DNA polymerase was
solved by molecular replacement with the AMoRe
program. 22 The structure of Tgo DNA polymerase (PDB
code 1TGO) reduced to polyalanine was used as the
search model. Data in the resolution range of 20.0-3.5 Å
were used in both the rotation and translation functions.
Results are discussed in terms of the AMoRe correlation
coefficient (CC). Using a Patterson cut-off radius of 36 Å,
a list of 20 rotation function peaks was obtained, with
the top peak having an AMoRe CC value of 13.8. The
top solution of the translation function had a CC of 43.3 with an
R-factor of 54.1%. At this stage, the electron density of
the Thumb domain was very ambiguous. Therefore, structural
refinement of the initial stage was carried out with
a model omitting the Thumb domain. The model was
manually modified using the program O 23 and subjected
to further rounds of refinement using data in the resolution
range 40.0-3.0 Å with the program CNS. 24 The
final R-factor is 23.1% and Rfree is 31.3%, with r.m.s.
deviations for bond lengths and bond angles being
0.007 Å and 1.1°, respectively. The 50 residues at the tip
of one Thumb domain are not included in the final
model due to poorly defined electron density. Figure 5
shows the final 2Fo - Fc map superimposed on the
refined final coordinates of KOD DNA polymerase.

Figure 5. The final 2Fo - Fc map around the Fingers
and Palm subdomains. The map is contoured at 1 σ.
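For readers unfamiliar with the R-factor and free R-factor quoted in the structure determination paragraph, the conventional definition is R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|. The sketch below implements that textbook definition only; it assumes the calculated amplitudes are already on the same scale as the observed ones, and the amplitudes used are made-up toy values, not data from this structure.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitudes.

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists differ in length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Toy amplitudes for illustration only.
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 25.0]
toy_r = r_factor(toy_obs, toy_calc)
print(f"R = {toy_r:.3f}")
```

The free R-factor uses the same formula, evaluated only on a set of reflections held out of refinement, which is why it is normally higher than the working R-factor.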



Protein Data Bank accession code 

Refined coordinates and structure factors have been
deposited in the RCSB Protein Data Bank under the
accession code 1GCX.



Figure preparation 

Figures 1 and 2 were prepared using programs
MOLSCRIPT 25 and Raster3D. 26,27 Figure 4 was prepared
by GRASP. 28 Figure 5 was prepared using the program
O. 23
ABSTRACT Most known archaeal DNA polymerases belong
to the type B family, which also includes the DNA
replication polymerases of eukaryotes, but maintain high
fidelity at extreme conditions. We describe here the 2.5 Å
resolution crystal structure of a DNA polymerase from the
Archaea Thermococcus gorgonarius and identify structural 
features of the fold and the active site that are likely respon- 
sible for its thermostable function. Comparison with the 
mesophilic B type DNA polymerase gp43 of the bacteriophage 
RB69 highlights thermophilic adaptations, which include the 
presence of two disulfide bonds and an enhanced electrostatic 
complementarity at the DNA-protein interface. In contrast to 
gp43, several loops in the exonuclease and thumb domains are 
more closely packed; this apparently blocks primer binding to 
the exonuclease active site. A physiological role of this 
"closed" conformation is unknown but may represent a poly- 
merase mode, in contrast to an editing mode with an open 
exonuclease site. This archaeal B DNA polymerase structure 
provides a starting point for structure-based design of poly- 
merases or ligands with applications in biotechnology and the 
development of antiviral or anticancer agents. 

Propagation of cells requires faithful DNA replication. This is 
performed in vivo by DNA polymerases (pols), which attach 
the appropriate dNTP to the nascent DNA primer strand to 
match its paired template. Different families of pols are 
involved in different DNA polymerization processes including 
not only DNA replication (1, 2) but also repair and recombi- 
nation (3, 4), a heterogeneity also reflected by varying 
polypeptide structures and/or subunit compositions (3, 5). 
Some pols complement polymerase activity with 3' → 5'
exonuclease activity (editing activity) and/or 5' → 3'
"structure-specific endonuclease" activity, often located in separate
structural domains on the same polypeptide chain (4-8).

Crystal structures are available for most known polymerase
families, including the A family DNA polymerases (9-14), pol
β (15-17), HIV reverse transcriptase (18-20), and recently,
the B family pol gp43 from bacteriophage RB69 (21). All share
a functional polymerase structure, which resembles a right
hand built by the palm, fingers and thumb domains (see ref. 7
for review). Although the fingers and thumb domains are
highly diverse among the different families, the palm domains,
which contain the conserved catalytic aspartate residues, show
a similar topology among all families except pol β. The
polymerase nucleotidyl transfer was studied in detail for the A
family polymerases, HIV reverse transcriptase, and pol β, and
was shown to involve two metal ions (summarized in ref. 7).

Considerably less is known for the family of type B pols,
which are replicative enzymes in eukaryotes and most likely
also Archaea (22, 23). The structure of gp43 from bacteriophage
RB69 (21) provided an excellent first insight into this
family. In addition to the three polymerase domains, gp43
contains a 3' → 5' exonuclease domain and an N-terminal
domain. The exonuclease and palm domains share the topology
and active site of A family enzymes, implying similar
metal-assisted mechanisms for polymerase and exonuclease 
activities (21). The thumb and finger domains are apparently 
unrelated to the other polymerase families. The function of the 
N-terminal domain remains unknown, but may help assemble 
the multicomponent replication apparatus (21). 

Much is known about the replication of phages (24-26), 
viruses (1, 27), Prokaryota, (28) and Eukaryota (1, 3, 29, 30), 
which in general involves pols but also primases, helicases, 
RNaseH, sliding clamps, and other factors (31). Considerably 
less is known for archaeal replication, where mostly B type
polymerases, similar to eukaryotic replication enzymes pol α
and δ, have been identified (6, 22, 23, 32-34). This relative
ignorance is surprising, because such crucial biotechnological
applications as cloning and PCR require the thermostability
and fidelity typical of archaeal polymerases (6). Thus, in
addition to satisfying basic research interests, structural information
could assist, for example, the engineering of variant
enzymes with tailored nucleotide incorporation rates or the
design of antiviral and anticancer polymerase inhibitors. For
these reasons, we have determined the structure of a DNA
polymerase from Thermococcus gorgonarius (Tgo), an extremely
thermophilic sulfur-metabolizing archaeon isolated
from a geothermal vent in New Zealand (35). This enzyme
possesses pol and 3' → 5' exonuclease activities, which
together ensure thermostable replication with high fidelity
(error rate: 3.3-2.2 × 10⁻⁶; see ref. 36). The 2.5 Å structure
shows a topological similarity to gp43 and gives insight into the
structural biology of archaeal DNA polymerases, including the
identification of several mechanisms for thermophilic adaptation.



MATERIALS AND METHODS 

Materials. All materials were of the highest grade commer- 
cially available. 

Bacterial Strains. Escherichia coli LE392 containing 
pUBS520 was used as described (36). E. coli B834 (DE3) (hsd 
metB) was a generous gift of Nediljko Budisa (Max-Planck- 
Institut). 

Expression Vectors. pBTac2 was obtained from Roche
Molecular Biochemicals.

Abbreviations: Tgo, Thermococcus gorgonarius; pol, DNA polymerase. 
Data deposition: The atomic coordinates have been deposited in the 
Protein Data Bank, Biology Department, Brookhaven National Lab- 
oratory, Upton, NY, 11973 (PDB ID code 1TGO). 
†Present address: The Scripps Research Institute, La Jolla, CA 92037.
♦To whom reprint requests should be addressed. E-mail:
hopfner@scripps.edu.
Table 1. Data collection and isomorphous replacement

Native  3.0  136,953  21,529  91.9   7.0         0.31
3.2
PT1     3.7   65,464  11,692  98.2  12.0  22.6   1.54
PT2           53,870   6,293  66.9  13.3  13.8   1.77
PT3     3.4   85,226  14,155  92.8  10.5  17.1   1.16
PT4     2.6  387,925  27,487  90.6   5.7  35.5   0.67
PT5     2.7  358,695  24,543  88.7   6.1  36.2   0.83
PTU     3.5   79,629  11,437  82.4   9.4  24.7   1.54
PB      3.5   78,190  11,336  92.9  11.5  12.8   0.41
PBPT    3.4   94,557  14,129  91.6   8.0  19.6   1.04
OS      2.8  600,108  24,026  91.2   5.3  38.4   1.82

Overall figure of merit (15.0-3.0 Å): 0.73




for 

d; PT5, saturated cis-dichlorodipyridine-Pt(II) [...]
saturated K2OsO4 for 7 d. NAT1, PT1, PT2, PT3, PTU, PB and PBPT were collected with a Mar imaging plate and U was collected with a Bruker
AXS area detector on a Rigaku rotating anode source. All other data sets were collected with a Mar charge-coupled device at beamline BW6 at
DESY, Hamburg.

(40), and several cycles of solvent flattening to 3.0 Å by using
solomon (40). At this stage, no interpretable density was
found for a significant portion of the molecule, comprising
residues 147-154, 283-306, 653-728, and 752-773.

Model Building and Refinement. The partial model (R
factor 35%) was used to phase the 2.5 Å resolution data of the
Se-Tgo pol (high-salt conditions). The model was oriented with
amore (40). The correlation coefficient of 22.0% and the R
factor of 50.3% showed divergence of the high- and low-salt
structures. After bulk solvent correction, anisotropic B factor
correction and rigid-body minimization (treating five domains
independently), the partial model was iteratively refined and
extended with simulated annealing, Powell minimization,
restrained individual B factor refinement with cns (42), and
manual model building with main (41) by using data from
25.0-2.5 Å resolution (Table 2, Fig. 1).

RESULTS AND DISCUSSION 

Structure of Tgo pol. Tgo pol is a ring-shaped molecule with
dimensions 50 Å × 80 Å × 100 Å. The single polypeptide chain
of 773 aa is folded into five distinct structural domains (Fig. 2):
the N-terminal domain (residues 1-130), the 3' → 5' exonuclease
domain (131-326), the palm (369-449 and 500-585),
fingers (450-499), and thumb (586-773) domains of the
polymerase unit, and a helical interdomain insertion (327-368)

Table 2. Crystallographic refinement, high-salt form 



Cloning, Expression, and Protein Purification. The gene for
the 89.8-kDa DNA-dependent pol (Tgo pol) was cloned from
Tgo (Deutsche Sammlung von Mikroorganismen no. 8976) and
expressed in E. coli LE392pUBS520 pBtac2Tgo (Deutsche
Sammlung von Mikroorganismen no. 11328) as described (36).
Tgo pol was purified essentially as described (36) with the
substitution of the TSK butyl Toyopearl column by Blue-
Trisacryl M (Serva) and with an additional concentration step
on Poros 50 HQ anion exchange medium (Roche Molecular
Biochemicals). Active fractions were combined, concentrated
to 30 mg/ml, and transferred to 20 mM sodium phosphate, pH
8.2/10 mM 2-mercaptoethanol/500 mM NaCl.

The gene for a selenomethionine-containing variant of Tgo
pol (Se-Tgo pol) was expressed in E. coli B834 (DE3) (hsd
metB) using a published protocol (37). Se-Tgo pol was purified
by using the wild-type protocol.

Crystallization. Crystals of purified Tgo pol (or Se-Tgo pol)
were grown by using the sitting-drop vapor-diffusion technique at
4°C with high-salt conditions (2:2 µl protein:reservoir; 100
mM Tris, pH 8/2.0 M ammonium sulfate) and diffracted to 3.0
Å (in-house) and to 2.5 Å [beamline BW6 at Deutsches
Elektronen Synchrotron (DESY), Hamburg]. Low-salt conditions
(100 mM Tris, pH 7.0/200 mM ammonium sulfate/30%
PEG 400) yielded only poorly diffracting crystals but allowed
soaks (including heavy atoms) with some cell constant modulation
(a, b, c = 63.6, 105.0, 160.5) but minimal loss of
resolution.

Data Collection and Processing. Data were collected with a 
MAR imaging plate or a Bruker AXS X1000 mounted on a 
Rigaku rotating anode source, or with a MAR imaging plate 
or a MAR CCD (charge-coupled device) at beamline BW6 at 
DESY, Hamburg. The data were processed with saint (Bruker 
AXS), mosflm (Mar CCD; ref 38), or denzo (MAR imaging 
plate; ref 39), scaled with scala (40) or scalepack (39), and 
reduced with truncate (40). 

Structure Determination. The structure was solved by mul- 
tiple isomorphous replacement and anomalous scattering 
(MIRAS) by using data from crystals transferred to low-salt 
conditions (Table 1). Crystallographic calculations were done 
with programs from the CCP4 suite (40). Heavy atom positions 
of major sites were located in difference Patterson maps and 
were refined with mlphare (40) to calculate protein phase 
angles to 3.5 Å resolution. A partial polyalanine model was
built into interpretable portions of secondary structural ele- 
ments of the miras map by using main (41). The quality of the 
electron density was improved by phase combination of the 
partial model with the experimental phases by using sigmaa 
Fig. 1. Stereorepresentation of the electron-density map. The 2Fo - Fc electron density contoured at 1.0 σ at the polymerase active site is
well defined for the refined model (stick representation).



between the exonuclease and palm domains. The polymerase 
unit forms the DNA-binding crevice, reminiscent of a right 
hand, which is the identifying characteristic of pols. Gp43 from 
bacteriophage RB69 also shows this overall domain topology 
(21). 
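The domain boundaries listed for Tgo pol in the "Structure of Tgo pol" paragraph can be captured in a small lookup table, which makes it easy to see which domain a residue discussed in the text belongs to. This is a sketch for illustration; the dictionary layout and function names are hypothetical, and only the residue ranges come from the text.

```python
# Residue ranges for Tgo pol as quoted in the text (1-based, inclusive).
TGO_DOMAINS = {
    "N-terminal": [(1, 130)],
    "3'-5' exonuclease": [(131, 326)],
    "interdomain helix": [(327, 368)],
    "palm": [(369, 449), (500, 585)],
    "fingers": [(450, 499)],
    "thumb": [(586, 773)],
}


def domain_of(residue):
    """Return the name of the domain containing the given residue number."""
    for name, segments in TGO_DOMAINS.items():
        for start, end in segments:
            if start <= residue <= end:
                return name
    raise ValueError(f"residue {residue} is outside 1-773")


def domain_length(name):
    """Total number of residues in a domain, summed over its segments."""
    return sum(end - start + 1 for start, end in TGO_DOMAINS[name])


for res in (141, 404, 487, 542):
    print(res, domain_of(res))
print({name: domain_length(name) for name in TGO_DOMAINS})
```

The ranges tile the 773-residue chain without gaps or overlaps, and the computed lengths reproduce the 50-residue fingers and the 42-residue interdomain helix mentioned elsewhere in the text.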

Three clefts extend radially from the polymerase active site 
at the center of the ring: two of them in opposite directions, 
forming a large cleft across the molecule, and one perpendic- 
ular to these. Based on active-site homology to pol A family 
enzymes, the two opposite clefts probably bind duplex DNA 
(cleft D, according to ref. 21) and single-strand template DNA 
(cleft T), respectively. The perpendicular (editing) cleft links 



the polymerase active site and the exonuclease active site and 
binds the primer strand in editing mode (21). 

The exonuclease domain is structurally equivalent to the 3'
→ 5' exonuclease domain of the pol A family (43). Like gp43,
however, it is bound at the opposite side of the polymerase unit
by noncovalent contacts to the thumb domain at the editing
cleft, on one side, and by covalent and noncovalent contacts to
the N-terminal and palm domains and the 42-residue interdomain
helix, on the other side. This latter segment is located
at the base of cleft T, which is additionally bounded by the
exonuclease, N-terminal, and palm domains.

The topology of the palm domain is conserved among 
polymerase families (5), with two long helices (Q and R) 



[Fig. 2 panel labels: Tgo DNA polymerase; RB69 gp43; polymerase active site; exonuclease active site; thumb; cleft T]




Fig. 2. Structure of Tgo pol and comparison with gp43 from bacteriophage RB69. (Left and middle) Ribbon representation of Tgo DNA
polymerase (Upper) in two orthogonal views with labeled secondary structure elements. The molecule is composed of five domains: N-terminal
domain (yellow), 3' → 5' exonuclease (red), palm [light and dark magenta represent the N-terminal and C-terminal segment, respectively (see text)],
fingers (blue), and thumb (green), which are arranged to form a ring. An interdomain helical segment between the exonuclease domain and the
palm is orange. The conserved carboxylates in the active site and the two disulfide bridges are shown as magenta and yellow ball-and-sticks,
respectively. Tgo pol has the same overall architecture and domain topology as gp43 of RB69 (Lower). The 50-residue insertion in the fingers
of gp43 is gray. (Right) Comparison of molecular surfaces of Tgo pol (Upper) and RB69 gp43 (Lower). Red and blue denote negative and positive
electrostatic surface potentials, respectively. In contrast to gp43, Tgo pol has a strongly enhanced positive potential at the putative DNA-binding
clefts.
Fig. 3. Sequence alignment of B family DNA polymerases. The
alignment has been adapted from ref. 21 to highlight specific residues
from the class of archaeal pols. The secondary structure of Tgo pol is
indicated on top of the alignment with helices (bars), strands (arrows)
and loops (lines) colored according to domains with the same color
code as Fig. 2B. Strictly conserved residues of type B polymerases are
red, and additional conserved residues are yellow. Uniquely conserved
residues of archaeal type B enzymes, as discussed in the text, are
green. Disulfide bonds are shown by a bar on top of the alignment.
Abbreviations: tgo, Thermococcus gorgonarius pol; pfu, Pyrococcus
furiosus pol; tsp, Thermococcus sp. pol; tli, Thermococcus litoralis pol;
mvo, Methanococcus voltae pol; RB69, bacteriophage RB69 pol; T4,
bacteriophage T4 pol; Eco, E. coli pol II; pol δ, human pol δ.

packed against the five-stranded antiparallel β-sheet that
contains the three conserved aspartate residues involved in
nucleotidyl transfer. The fingers emerge from the palm domain
as an α-helix-rich insertion. Its 50 residues are folded into two
antiparallel coiled α-helices of approximately equal size: helix
P contains the conserved Kx3NSxYGx2G motif of B type
polymerases and is related to the O helix of A type enzymes
(see below). The ≈50-residue insertion between helix O and P
in RB69 and T4 gp43 is missing in Tgo pol, where both helices
and the 4-residue linker are much shorter than their equivalents in
gp43. The shorter fingers of Tgo pol presumably reflect the
typical structure of the nonbacteriophage B type fingers (pol
α, pol δ, and E. coli pol II). The thumb domain topology,
similar to that of gp43, is unrelated to other polymerase types.
However, like the thumb of A type enzymes, a bundle of
α-helices at its base protrudes from the active site β-sheet.
Distal to the active site, the thumb contains a 75-residue
subdomain (665-729), which faces the exonuclease domain and


contributes to the editing channel, explaining why mutations in
the exonuclease domain of B-type polymerases affect the
polymerase activity and vice versa (44, 45).

Weakly defined density across the base of the thumb domain
was modeled as the C-terminal 6 residues with a polyalanine
chain. The C terminus thus does not protrude from the core
molecule as in the RB69 polymerase (21). Because the C
terminus of the T4 pol is involved in sliding-clamp binding
(46), it is likely, however, that these residues become ordered
on any similar holoenzyme formation.

Sequence Alignment of Archaeal DNA Polymerases. The
structure of Tgo pol allows the generation of a structure-based
sequence alignment of the archaeal subfamily of type B DNA
polymerases, the location of conserved and unique residues,
and the comparison with other type B DNA polymerases (Fig.
3).

Polymerase Active Site. The polymerase active site is formed
by the central β-sheet (strands 16, 17, 20, 21, and 22) and helix
N of the palm domain and helix P located in the fingers and is
highly conserved among B family polymerases (Fig. 4). Three
carboxylates required for nucleotidyl transfer in B family
polymerases, two of which coordinate two metal ions (14), are
superimposably conserved among A family enzymes, B family
enzymes, and reverse transcriptase (21). Superposition of Tgo
pol and T7 replication complex (14) places the dNTP near the
proposed nucleotide-binding site in helix P, the
K487x3NSxYGx2G motif (Fig. 3), and suggests interactions of
the carboxylates with metals and the phosphate tail of the
bound dNTP (Fig. 4). Reorientation of the strictly conserved
Lys-487 allows it to mimic the Lys-522-phosphate tail interaction
in T7. Tyr-494 (Kx3NSxY494Gx2G) and Tyr-409
(SLY409PSII) form the bottom of the nucleotide-binding site.
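The Kx3NSxYGx2G motif named above reads naturally as a pattern in which x stands for any residue. The sketch below shows one way to search a protein sequence for it with a regular expression. The sequence used is a synthetic toy string, not the Tgo pol sequence, and the function name is hypothetical.

```python
import re

# K, any 3 residues, N, S, any residue, Y, G, any 2 residues, G.
B_FAMILY_MOTIF = re.compile(r"K.{3}NS.YG.{2}G")


def find_motif(sequence):
    """Return (1-based start position, matched text) for each motif hit."""
    return [
        (match.start() + 1, match.group())
        for match in B_FAMILY_MOTIF.finditer(sequence.upper())
    ]


# Synthetic toy sequence for illustration only.
TOY_SEQUENCE = "MAAKLLDNSAYGVLGTT"
hits = find_motif(TOY_SEQUENCE)
print(hits)

# Offset of the conserved tyrosine from the leading lysine within the motif.
tyr_offset = hits[0][1].index("Y")
print(tyr_offset)
```

The tyrosine sits seven positions after the lysine in the pattern, which agrees with the residue numbers in the text: Lys-487 plus seven gives Tyr-494.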

The active site of B family polymerases contains a DTDS
motif, which, however, is DTDG in the archaeal subfamily. In
Tgo pol, the relatively conserved Tyr-402 from the adjacent
strand provides an alternate alcohol group, at a position
appropriate for metal coordination or binding of the 3' end of
the primer. The orientation of Tyr-402 is stabilized by an
aromatic cluster that also includes Phe-545 and Tyr-538.
Archaeal Methanococcus voltae and Thermococcus sp. pols
(see Fig. 3) have Tyr at position 545 (Phe in Tgo) rather than
at 402, but might also supply an alcohol group. The displacement
of a functional alcohol from serine in DTDS to Tyr-402
or Tyr-545 might stabilize its orientation as an adaptation for
thermostability.

The conserved cluster of acidic amino acids (E578, E580)
forms an unexpected metal-binding site for Mn2+ and Zn2+
(Fig. 4). Its proximity to Asp-404 and to the expected location
of the dNTP γ-phosphate suggests a supporting role in nucleotide
binding and/or catalysis.

3' → 5' Exonuclease Active Site. Tgo pol is characterized by
a strong 3' → 5' exonuclease activity, unlike eukaryotic B type
polymerases (unpublished results). The exonuclease active site
is formed at the interface between the exonuclease domain and
the tip of the thumb (Fig. 5). All residues required for catalysis
are located in the exonuclease domain, which, at least for T4
gp43, retains activity when dissociated from the polymerase
(47). However, the thumb domain, with, for example, RB69
gp43's Phe-123-base intercalation, partially controls the binding
geometry of single strand DNA (21, 43).

The exonuclease structures of Tgo and gp43 DNA poly- 
merases are similar at the editing site but differ considerably 
at the exonuclease-thumb interface. Strand 10 contains the 
metal-binding D141IE motif and readily superimposes with 
the equivalent strand from gp43, allowing modeling of a 
single-strand DNA segment into the exonuclease site based on 
the RB69 gp43-p(dT)4 complexes (21). The conserved residues
Asp-141 and Glu-143 in the Exo I motif, Tyr-209, Asn-210, 
Phe-214, and Asp-215 in Exo II, and Tyr-311 and Asp-315 in 
Exo III are in approximate DNA-binding conformations (Fig. 
Fig. 4. Polymerase active site. (A) Stereoribbon representation (using color code as in Fig. 2) with modeled DNA. Active-site residues are shown
as ball-and-stick representations with carbon (green), nitrogen (blue), and oxygen (red) atoms. The DNA template (light brown), primer (light
brown), and dNTP (orange) complex has been taken from the coordinates of the T7 replication complex (15) by superimposing D404, D542, and
adjacent residues with corresponding residues in T7 pol (D475 and D654). Phosphorus atoms are yellow. The two metals of the T7 replication
complex are shown as magenta spheres. (B) Experimentally observed metal-binding site for Mn2+ (Fo - Fc density contoured at 5σ) and Zn2+
in the "low salt" crystal form. The carboxylates E578 and E580 are conserved in type B polymerases (Fig. 3).



5A). However, the editing cleft is constricted by a displacement
of the tip of the thumb toward the exonuclease domain to
prohibit single-strand binding (Fig. 5B). This shift is correlated
with a large change in the loop between strands 10 and 11. In
RB69 gp43 (and likewise in T4 gp43), this loop forms a lid over
the 3' base and contains Phe-123, which intercalates between
the first two bases. In Tgo pol, the loop is curved outward, away
from the thumb, and Phe-152 (the equivalent of gp43's Phe-123)
attaches to Phe-214, 10 Å away from the intercalation site.
This shift allows the tip of the thumb to move into the editing
channel and to block the exonuclease site.

Are There Different Conformations in Polymerase and 
Editing Mode? If a closed conformation of the exonuclease 
domain prohibits single strand binding, an open conformation 
is required for editing. The observed closed conformation may
represent the enzyme in "polymerase" mode. Preliminary
analysis of the crystal structure of Tgo pol in the low-salt
conditions indicates a structural change at the interface of
exonuclease and thumb, possibly reflecting a transition between
open and closed forms. The closed conformation observed
here may, however, be a nonphysiological artifact of the
high ionic strength used for crystallization. Crystal structures
of the enzyme in both polymerase and editing modes are
required.

Adaptation to High Temperatures. Tgo is a sulfur- 
metabolizing, extremely thermophilic archaeon, with a growth 
range between 55°C and 98°C. For accurate replication at this 
temperature range, the polymerase must not only be stable, but 
must also adequately bind substrate DNA. A comparison with 
gp43 from the mesophilic bacteriophage RB69 indicates several
such adaptations to high temperatures. Several loops are
shorter in Tgo pol than in gp43 (Fig. 2), and there is an increase
in hydrogen-bonded β-strand content: Tgo pol secondary
structure includes 41% helix, 22% β-strands, and 19% turns
(calculated according to ref. 48), whereas gp43 has 42% helix,
17% β-strands, and 19% turns.

Although rare among cytoplasmic or nuclear proteins, two 
disulfide bridges might be formed: cysteine pairs 428-442 and 




Fig. 5. 3'→5' exonuclease active site. (A) Stereoribbon representation with modeled DNA using the color code of Fig. 4A (ball-and-stick)
and Fig. 2 (ribbons). Active-site residues are shown as ball-and-stick representations. The single-stranded DNA has been taken from the coordinates
of the RB69-single-stranded DNA complex (21). The orientation of the DNA has been obtained by superimposing the DIE143 motif in Tgo pol with
the corresponding motif of RB69 gp43 (DIE116). Strand 27 and its preceding loop from the thumb (green) are apparently in collision with the modeled
DNA. (B) Comparison of the exonuclease-thumb interface between Tgo pol (color code of Fig. 1B) and RB69 gp43 (gray). In Tgo pol, the lid of
the editing site (red) is bent outward compared with the equivalent loop of gp43 (yellow), allowing the tip of the thumb to move several Å
closer to the exonuclease. This conformation is incompatible with formation of an editing complex [the p(dT)4 of gp43 is shown as a brown space-filling
model].
506-509, although reduced, are poised for attachment (Fig. 2).
Model refinement and electron density inspection with and
without constraints for the disulfide bonds verified the reduced
state (observed unrestrained SG-SG distance: 2.8 Å and 3.0
Å). This is consistent with our E. coli expression and further
rules out structural perturbation by nonnative oxidation. These
cysteines are located in the palm domain and are conserved
among B type enzymes from hyperthermophilic sulfur-metabolizing
archaeons, but not among mesophile homologs
(Fig. 3). The Cys-428-Cys-442 bridge stabilizes the compact
fold of the loop segment between helix N in the palm domain
and helix O in the fingers and presumably also the relative
orientation of these helices. In addition, the loop segment
packs against helix Q in the palm domain. Helix Q, the spine
of the palm domain, is further stabilized at the first helical turn
by the second disulfide bridge between Cys-506 and Cys-509.

A much enhanced complementary positive potential for all 
three DNA-binding clefts of Tgo pol is observed relative to 
gp43 (Fig. 2). Thus, in addition to hydrogen bonding and 
specific DNA-protein interactions, binding to Tgo pol has an 
additional strong stabilizing electrostatic component. 

An increase in the number of salt bridges is often associated
with thermostability. Although Tgo pol has a greater total
number of charged residues (262) than gp43 (245), both
molecules have 54 salt bridges within a 3-5 Å bound. However,
in the 5-7 Å range of charge distance, Tgo pol has 77 ion pairs
compared with 43 for gp43. This large increase results in a
more highly charged surface of Tgo pol, accompanied by a
more balanced charge distribution, compared with gp43 where
charges are often located in patches (Fig. 2).
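
The ion-pair comparison above reduces to counting oppositely charged atom pairs whose separation falls within a distance window. A minimal sketch of that bookkeeping follows. The function name, the choice of one representative atom per charged side chain, and the toy coordinates are assumptions made for illustration; they are not taken from the analysis reported here.

```python
import math


def count_ion_pairs(positive, negative, lo, hi):
    """Count (positive, negative) atom pairs with lo <= distance < hi, in Å."""
    count = 0
    for p in positive:
        for n in negative:
            if lo <= math.dist(p, n) < hi:
                count += 1
    return count


# Toy coordinates in Å, invented for illustration only.
lys_nz = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]   # hypothetical cationic atoms
asp_od = [(4.0, 0.0, 0.0), (16.0, 0.0, 0.0)]   # hypothetical anionic atoms

short_range = count_ion_pairs(lys_nz, asp_od, 3.0, 5.0)  # the "3-5 Å" window
long_range = count_ion_pairs(lys_nz, asp_od, 5.0, 7.0)   # the "5-7 Å" window
print(short_range, long_range)
```

With real coordinates the result depends on which atoms are taken to carry the charge and on how the window edges are treated, so counts from different tools need not agree exactly.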

We thank Dr. Uwe Jacob and Martin Augustin for stimulating 
discussions, Dr. Nediljko Budisa for help with expression of selenome- 
thionine variants, and Dr. Hans Bartunik and Dr. Gleb Bourenkov at 
Crystal structure of an archaebacterial DNA polymerase

Yanxiang Zhao1*, David Jeruzalmi1†, Ismail Moarefi1,2, Lore Leighton1,2†,
Roger Lasken3 and John Kuriyan1,2*



Background: Members of the Pol II family of DNA polymerases are responsible 
for chromosomal replication in eukaryotes, and carry out highly processive DNA 
replication when attached to ring-shaped processivity clamps. The sequences 
of Pol II polymerases are distinct from those of members of the well-studied 
Pol I family of DNA polymerases. The DNA polymerase from the 
archaebacterium Desulfurococcus strain Tok (D. Tok Pol) is a member of the 
Pol II family that retains catalytic activity at elevated temperatures. 

Results: The crystal structure of D. Tok Pol has been determined at 2.4 Å
resolution. The architecture of this Pol II type DNA polymerase resembles that
of the DNA polymerase from the bacteriophage RB69, with which it shares less
than ~20% sequence identity. As in RB69, the central catalytic region of the
DNA polymerase is located within the 'palm' subdomain and is strikingly similar
in structure to the corresponding regions of Pol I type DNA polymerases. The
structural scaffold that surrounds the catalytic core in D. Tok Pol is unrelated in
structure to that of Pol I type polymerases. The 3'-5' proofreading exonuclease
domain of D. Tok Pol resembles the corresponding domains of RB69 Pol and
Pol I type DNA polymerases. The exonuclease domain in D. Tok Pol is located
in the same position relative to the polymerase domain as seen in RB69, and on
the opposite side of the palm subdomain compared to its location in Pol I type
polymerases. The N-terminal domain of D. Tok Pol has structural similarity to
RNA-binding domains. Sequence alignments suggest that this domain is
conserved in the eukaryotic DNA polymerases δ and ε.
Conclusions: The structure of D. Tok Pol confirms that the modes of binding of
the template and extrusion of newly synthesized duplex DNA are likely to be
similar in both Pol II and Pol I type DNA polymerases. However, the mechanism
by which the newly synthesized product transits in and out of the proofreading
exonuclease domain has to be quite different. The discovery of a domain that
seems to be an RNA-binding module raises the possibility that Pol II family
members interact with RNA.



Introduction 

DNA polymerases can be classified into at least three families
on the basis of sequence similarities to the three distinct
DNA polymerases of Escherichia coli, Pol I, Pol II and
Pol III [1]. Members of the Pol I family have been studied
extensively, resulting in a comprehensive understanding
of their functional properties and their structure [2-6]. In
contrast to the detailed knowledge that is now available
for the Pol I family, the Pol II and Pol III polymerases are
poorly understood. The first crystal structure determined
for a Pol II family member was that of the DNA polymerase
of the bacteriophage RB69 (RB69 Pol) [7] and no
structural information is currently available for any
member of the Pol III family. Members of the Pol II (also
known as Pol B or Pol α) and Pol III families carry out processive
replication of chromosomal DNA during cell division
[8], and there is interest in further extending our
knowledge of their structures and mechanism. Archaebacterial
DNA polymerases and the eukaryotic DNA polymerases
α, δ and ε are members of the Pol II family [1].

The structure of RB69 Pol revealed that the general architecture
of the core of the Pol II polymerases is strikingly
similar to that of the Pol I polymerases [7]. Pol I polymerases
are constructed from three smaller subdomains,
termed the thumb, palm and fingers regions by analogy to
elements first noted in the structure of the Klenow fragment
of E. coli DNA polymerase I [9]. In addition, Pol I
DNA polymerases have a proofreading 3'-5' exonuclease
domain located below the thumb subdomain, near the
region where duplex DNA exits the polymerase active site
[4,5]. Besides the residues involved in catalysis, there is no
significant sequence similarity between the polymerase
domains of members of the Pol I and Pol II families [1].
However, the subdomain architecture of the Pol I family is
conserved in the RB69 structure, even though the detailed
structures of the subdomains are quite divergent [7]. The
exonuclease domains of Pol I and Pol II DNA polymerases
are closely related in sequence and, not surprisingly, the
structure of the exonuclease domain of RB69 resembles
that of the Pol I type polymerases. Given the general similarity
in the polymerase domains of the Pol I polymerases
and RB69, the location of the exonuclease domain in RB69
was a surprise. In RB69 the 3'-5' exonuclease domain is
located above the fingers and opposite the thumb subdomains,
suggesting that the shuttling of DNA between the
polymerization and proofreading sites must occur by a different
mechanism in Pol II DNA polymerases [7].

The mechanism of the Pol I family DNA polymerases is 
now understood in detail [4,5,10,11,28]. The chemistry of 
nucleotide addition is mediated by two metal ions that are 
liganded by two aspartate residues. These are located in 
the palm subdomain, at the base of a deep cleft in the polymerase
domain. High-resolution crystal structures of the
Pol I type DNA polymerases of T7 bacteriophage (T7 Pol)
and Thermus aquaticus (Taq Pol) complexed to primer-template
DNA and incoming nucleotide have been determined,
allowing the mechanisms of nucleotide
incorporation and selectivity to be visualized [10,11,28].
Although corresponding structural information for the Pol 
II family DNA polymerases is lacking, similarities in 
general organization of the polymerase core as well as 
sequence conservation within crucial elements of the 
central palm subdomain suggest that general features of the 
recognition of DNA will be similar in Pol II polymerases. 

The DNA polymerase from the archaebacterium Desulfurococcus
strain Tok (D. Tok Pol) is a member of the Pol
II family, and has both thermostable DNA polymerase
and 3'-5' exonuclease activities [12]. D. Tok Pol sustains
undiminished DNA polymerase activity after incubation
at 95°C for one hour (RL, unpublished results). The
sequence of D. Tok Pol is very closely related (> 75%
identity) to that of other archaebacterial DNA polymerases,
such as those from Pyrococcus furiosus [13] and
Thermococcus litoralis [14]. D. Tok Pol is also related to
eukaryotic DNA polymerases α, δ and ε (34% sequence
identity over 196 residues of the DNA polymerase core for
the human δ sequence) [1]. The archaebacterial genomes
also contain genes coding for proteins with clear homology
to proliferating cell nuclear antigen (PCNA), the DNA
polymerase clamp in eukaryotes, as well as subunits of the
clamp-loader complex RF-C (replication factor C). It is
likely that archaebacterial DNA polymerases achieve processivity
by attachment to the ring-shaped PCNA clamp,
although direct evidence for such a mechanism is lacking.
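
Figures such as "> 75% identity" or "34% sequence identity over 196 residues" come from counting matching positions in a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. The function name and the short toy sequences are hypothetical, and the convention used for the values quoted above is not stated in the text, so this is an illustration rather than a reproduction.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration only.
identity = percent_identity("DTDGLYA-K", "DTDSLYATK")
print(identity)
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which yields lower percentages for the same alignment.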

We have determined the structure of D. Tok Pol at 2.4 Å
resolution. D. Tok Pol shares less than 20% sequence
identity with RB69 Pol, but the structures of the two
enzymes resemble each other closely. The structure
reported here has been determined in the absence of
DNA. Nevertheless, the close structural correspondence
between the active sites of Pol I and Pol II DNA polymerases
allows inferences to be made about the mode of
DNA recognition by D. Tok Pol. The very N-terminal
region of D. Tok Pol contains a domain (residues 1-132)
that is closely related in structure to single-stranded RNA-binding
domains (RBDs), also known as RNA-recognition
modules (RRMs) [15]. The structure of the 3'-5' proofreading
exonuclease domain of D. Tok Pol is similar to
those of the Pol I type polymerases. However, its location
relative to the palm subdomain resembles the location
seen in RB69 [7] rather than the Pol I type polymerases
[9,16,23]. The structure of D. Tok Pol reported here provides
further evidence that the mode of DNA-template
recognition and the distinct editing channel established
for the Pol II family by the structure of RB69 Pol are valid
for the entire Pol II family.

Results and discussion 

Structure determination 

Crystals of D. Tok Pol have been obtained from
2,4-methylpentanediol (MPD) (Native I) and polyethylene
glycol (PEG) 400 (Native II). Both crystal forms are
orthorhombic (P2₁2₁2₁; a = 64.8 Å, b = 107.6 Å, c = 153.2 Å
for Native I and a = 66.1 Å, b = 107.6 Å, c = 155.9 Å for
Native II). Experimental phases (Table 1) to 3.0 Å were
obtained from four isomorphous heavy-atom derivatives,
using Native II and the program SHARP [17]. Phases were
improved by iterative cycles of real-space density modification,
consisting of solvent flipping and negative density
truncation, using SOLOMON [18,19]. The resulting electron-density
map allowed the chain to be traced unambiguously,
with ready determination of sequence register.
The model was refined to 2.6 Å against data for Native II
(R value = 24.2%, Rfree = 29.5%) and subsequently to 2.4 Å
against data for Native I (R value = 25.3%, Rfree = 29.9%),
using CNS [20]. The model for Native II is somewhat
more complete (see the Materials and methods section)
and is used for most of the discussion. This model includes
740 residues from 1 to 756 in Native II. Amino acids
386-390 and 665-676 are not visible in our electron-density
maps and are not included in the model.

General description of the structure 

D. Tok Pol (Figure 1) is composed of a polymerase
domain (residues 390-773) and an exonuclease domain
(residues 133-385), as well as an N-terminal domain
(residues 1-131) that is not found in Pol I type DNA polymerases
[4]. The polymerase domain further comprises
three smaller subdomains, termed the thumb (residues
607-756), palm (residues 390-445 and 500-606) and
fingers (residues 446-499). The structures of the MPD
and PEG400 crystal forms of D. Tok Pol are very similar
Table 1

Data collection, structure determination and refinement statistics.

calculated heavy atom structure factor amplitude. § Figure of

in terms of the individual subunits. The major difference
between the two structures is a rotation of ~8-10° in the
orientation of the exonuclease domain with respect to the
thumb subdomain.
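
The residue boundaries listed above can be written as a small lookup table, which is convenient when assigning residues discussed later (for example Asp404 or Asp141) to a domain. The sketch below is illustrative only: the names `DOMAINS` and `domain_of` are hypothetical, and the ranges are copied from this paragraph as printed, so residues that fall between the stated ranges (such as 132, or 757-773 of the polymerase domain) return `None`.

```python
# Residue ranges as given in the text; palm is listed in two segments.
DOMAINS = [
    ("N-terminal", [(1, 131)]),
    ("exonuclease", [(133, 385)]),
    ("palm", [(390, 445), (500, 606)]),
    ("fingers", [(446, 499)]),
    ("thumb", [(607, 756)]),
]


def domain_of(residue):
    """Return the domain or subdomain name for a residue number, or None."""
    for name, ranges in DOMAINS:
        for start, end in ranges:
            if start <= residue <= end:
                return name
    return None


for res in (141, 404, 487, 542, 700):
    print(res, domain_of(res))
```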

The domains of D. Tok Pol are arranged as an irregularly
shaped flattened ring with a central cavity located near
the polymerase active site. The mostly α-helical thumb
subdomain forms one side of the active-site cleft and
makes contacts with the exonuclease domain (Figure 1).
The thumb domains of various polymerases
are often unrelated in structure. However, in all
cases where structures are available the thumb domain is
seen to fulfil an important role by forming contacts with
duplex DNA as it exits the polymerase active site [4]. The
D. Tok Pol structure has been determined in the absence
of DNA, and a portion of the thumb subdomain that is
likely to contact DNA (residues 665-676) is disordered.
This is commonly observed for the corresponding regions
of other polymerases in the absence of substrate
[9,21-24]. In the DNA polymerases from bacteriophage
T4 and RB69, the thumb subdomains also provide a
C-terminal element that interacts with the processivity
clamp [25,26]. In D. Tok Pol, the corresponding region
(residues 757-773) is disordered.

The central region of the active-site cleft is occupied by
the palm subdomain and includes residues important for
substrate discrimination and the catalysis of the polymerase
reaction. In D. Tok Pol, the palm is organized
around three β strands (β16, β19, β20) flanked by an α
helix (αQ) (Figures 1,2a,3a). It contains two disulfide



merit = ⟨|ΣP(α)e^(iα) / ΣP(α)|⟩, where α is the phase and P(α) is the
phase probability distribution.

# Rworking = Σ|F(obs) - F(calc)| / ΣF(obs).

Rfree = Σ|F(obs) - F(calc)| / ΣF(obs), calculated using 10% of the
data. Numbers in parentheses apply to the highest resolution shell.

bonds (Cys428-Cys442, Cys506-Cys509) that have not 
been previously observed in palm subdomains and which 
may be important for thermostability (Figure 1). 

Figure 1

Structure of D. Tok Pol. The structure is represented by cylinders for
helices, arrows for strands, and a thin worm for other secondary
structural elements. Two gray spheres represent metal ions (presumed
to be Mg2+) observed to be bound to the exonuclease domain. The
active site of the polymerase is marked by the location of two
aspartate residues D404 and D542. The two disulfide bonds are
indicated. Regions of the polypeptide chain that could not be modeled
in the palm subdomain because of disorder are indicated by dotted
lines. The various domains and subdomains and their boundaries are
indicated in the bar.
Figure 2

Comparison of DNA polymerase structures.
(a) A view of the secondary structural
elements of the polymerase active-site region
(palm and fingers subdomains) of D. Tok Pol,
colored as in Figure 1. (b) The corresponding
region of T7 DNA polymerase including the
primer-template duplex from the crystal
structure (PDB code 1T7P [11]). The
orientation of T7 Pol was derived by
superposition onto strands β16, β19, and β20
of D. Tok Pol. D. Tok Pol helix αP is seen to
be in an analogous position relative to the
active-site aspartates as T7 Pol αO. (c) A
GRASP surface representation of D. Tok Pol
with modeled primer-template duplex from the
T7 DNA polymerase-DNA complex (PDB
code 1T7P [11]). The surface is colored
according to sequence similarity (40-100%)
calculated as in Figure 7c. The primer strand
is an orange worm representing phosphate
positions, and the template strand is in gray.



The central elements of the palm subdomains from polymerases
belonging to the Pol I and Pol II families can be
aligned closely (the root mean square deviation [rmsd] in
Cα positions for strands β16, β19, β20 and helix αQ is in
the range of 0.9-2.0 Å), indicating a potential conservation
of function. There are two residues in the palm domains



Figure 3

Comparison of the structures of D. Tok Pol
and RB69 Pol. The structures of
(a) D. Tok Pol and (b) RB69 Pol are
presented in the same orientation after
superposition of their respective palm
subdomains. Structural elements that are in
common between the structures are
represented and colored as in Figure 1.
Elements that are unique to RB69 Pol are
colored in gray. Disordered segments are
indicated by dotted lines. The N-terminal and
the exonuclease domains are not shown.
of Pol I polymerases that are crucial for enzymatic activity
because they coordinate two metal ions [2,10,11,27]. The 
corresponding residues in D. Tok Pol are Asp404 and 
Asp542 (Figure 1). No metal ions are, however, visible in 
our electron-density maps. 

The fingers subdomain in D. Tok Pol consists of a set of
antiparallel α helices (αN, αO, αP; Figure 2). These
helices are shorter in length than the corresponding elements
of RB69 Pol, and a helical segment that connects
helices O and N in RB69 Pol is missing altogether
(Figure 3). The fingers domain of D. Tok Pol is unrelated
in overall structure to that of Pol I type polymerases
(Figure 2). However, helix αP in D. Tok Pol is positioned
similarly to helix O in Pol I polymerases (Figure 2), and is
likely to play an analogous and crucial role in recognition
of the incoming nucleotide [9-11,28].

The 3'-5' exonuclease domain in D. Tok Pol is located
opposite the thumb subdomain and above the fingers subdomain,
as noted for RB69 Pol. It contains two metal ions
(presumably Mg2+) ligated to Asp141 and Glu143
(Figure 1). The position of this domain relative to the
polymerase active site is distinct from the arrangement
seen in Pol I type polymerases. The conservation between
RB69 and D. Tok Pol of the location of the exonuclease
domain suggests that this is a characteristic feature of Pol
II type polymerases. The structure of the D. Tok Pol
3'-5' exonuclease domain resembles those associated with
other DNA polymerases [29,30]. The 3'-5' exonuclease
domains from the Pol I (E. coli, T. aquaticus, Bacillus subtilis,
bacteriophage T7) or Pol II (RB69) polymerase families
can be aligned onto each other closely (rmsd in Cα
positions for strands β10, β11, β12, β14 and helices αE and
αI is in the range of 1.0-2.8 Å). This alignment superimposes
residues associated with substrate binding, catalysis
and metal binding in a satisfactory manner (Figure 4) [4].

The arrangement of the N-terminal, exonuclease, and
polymerase domains creates two deep grooves leading into
and out of the polymerase active site. The D groove (for
duplex-DNA binding, following the nomenclature of [7])
is located immediately below the thumb subdomain and
includes a region of positive electrostatic potential. The
T groove (for template-DNA binding) leads away from
the active site in the opposite direction and is located
below the fingers subdomain. A small channel (the editing
channel) leads from the polymerase domain to the exonuclease
active site (Figure 2c).

We have used the structure of T7 Pol bound to primer- 
template DNA to model DNA onto D. Tok Pol 
(Figure 2c). Superposition of the palm subdomains of the 
two polymerases shows that remarkably few bad contacts 
are formed between the DNA (from T7 Pol) and atoms in 
the D. Tok Pol model. The one region that does collide 



Figure 4

Structural alignment of exonuclease domains. Structures of
exonuclease domains from KF, 1WAJ, 1T7P, 1BDF, and 1TAQ have
been aligned by superimposing residues 137-145, 158-164,
167-172, 205-220, 257-260, and 303-313, which represent
strands β8, β9, β10, β12, β15 and helices αE, αI. A color gradient is
used to depict the average rmsd for the family of superimposed
structures ranging from blue (1.0-1.5 Å) to white (> 4.0 Å). Residues
conserved amongst exonuclease sequences and implicated in
catalysis are drawn in green ball and stick representation. Two gray
spheres represent two metal ions bound at the active site. The active
site is also indicated by a tetranucleotide (in gold) derived from
superposition of the exonuclease domain from the RB69 Pol structure.



with the DNA is the segment connecting the exonuclease 
and polymerase domains. This region (residues 377-390) 
is partially disordered in the D. Tok Pol structures, and is 
likely to reorganize upon binding DNA. This superposi- 
tion allows five base pairs of DNA to be accommodated in 
the D. Tok Pol active site, with the formation of 
DNA-protein contacts. The formation of contacts with 
additional base pairs would require a change in the posi- 
tion of the thumb subdomain in the region of the 
D groove. A change in the conformation of the fingers subdomain
(helices αO and αP) is also required to position
residues Lys487 and Tyr493 (or Tyr494) of D. Tok Pol
(Figure 2) for interaction with the incoming nucleotide, by
analogy with the T7 Pol structure [11]. Finally, the superimposed
primer-template DNA is well positioned so that
the incoming template strand will probably reside in the 
T groove. Superposition of the DNA molecule derived 
from the structure of HIV-1 reverse transcriptase com- 
plexed to DNA [31] leads to similar conclusions. 

Comparison between D. Tok Pol and RB69 Pol 

Although D. Tok Pol and the DNA polymerase from bacteriophage
RB69 share less than 20% primary sequence
identity (Figure 5), their structures resemble each other
Figure 5

Structure-based sequence alignment for D. Tok Pol (DTOK), RB69 Pol
(RB69) and human Pol δ (HUMd). The HUMd sequence begins at
residue 110, as indicated by the number at the beginning of the
sequence. The alignment is colored by sequence similarity (40%, white
to 100%, green) calculated as described in Figure 7c. Shown here is a
small subset of a larger set of sequences that were used to generate
the alignment. The full sequence alignment is available at
http://www.rockefeller.edu/Kuriyan. The respective secondary
structural elements colored as in Figure 1 are represented by helices
as cylinders, strands as arrows, and other as thin lines. Gray circles
represent portions of the polypeptide chain that could not be modeled.



closely (Figure 3). Not surprisingly, the regions of highest 
sequence similarity are concentrated in and around the 
exonuclease and polymerase active sites (Figures 2c,5). 
Despite the low overall sequence identity, the individual 
subdomains in the two structures superimpose well (the 
rmsd in Cα positions in the fingers, thumb and palm subdomains
is in the range of 0.8 to 1.5 Å). Moreover, the
overall arrangement of domains and subdomains with 
respect to each other is preserved in the two polymerases, 
strengthening the proposal that Pol II DNA polymerases 
share a common architecture (Figure 3). 
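
The rmsd values quoted for superimposed subdomains are root mean square deviations over paired Cα positions. A minimal sketch of that final step is given below, assuming the two coordinate sets have already been superimposed and that the atom pairing is known. The function name and the toy coordinates are hypothetical, and the superposition itself (a least-squares fit, performed by the structure-comparison programs) is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between paired, pre-superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy Cα coordinates in Å, invented for illustration only.
ca_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ca_b = [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
value = rmsd(ca_a, ca_b)
print(value)
```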

One difference between the overall structures of 
D. Tok Pol and RB69 Pol concerns the orientation of the 
exonuclease domain with respect to the rest of the structure. 
When the two polymerases are superimposed on their 
respective palm subdomains it is seen that the exonuclease 
domain of RB69 is rotated inwards by ~8°, burying the
active site in a solvent-inaccessible configuration [7]. In con- 
trast, the exonuclease domain in D. Tok Pol has its active 
site essentially exposed to solvent. It is possible that confor- 
mational changes between open and closed configurations 
of the exonuclease domain are a part of the functional cycle 



of the protein, particularly as the two different forms of 
D. Tok Pol differ in the orientation of the exonuclease 
domain (not shown). 
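
A domain rotation such as the ~8° quoted above is typically read off the rotation matrix that superimposes the domain in one structure onto the other, once both structures share a common reference frame (here, the palm subdomain). The sketch below shows only that last step, using the standard relation between the trace of a rotation matrix and its angle. The function names are hypothetical, and the 8° test matrix is constructed for illustration rather than derived from coordinates.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle in degrees of a 3x3 rotation matrix: cos(theta) = (trace - 1) / 2."""
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp against rounding error
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(deg):
    """Build a rotation matrix about the z axis (used here only to make a test case)."""
    t = math.radians(deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


angle = rotation_angle_deg(rotation_about_z(8.0))
print(round(angle, 3))
```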

One interesting difference between D. Tok Pol and RB69 
Pol is that the former is a thermostable DNA polymerase 
whereas the latter is not. Unfortunately, attempts to identify
features in the D. Tok Pol structure that might be correlated
with thermostability are complicated by the very low
sequence similarity between the two enzymes. One feature
that does stand out, however, is the increased formation of 
arrays of ionic interactions on the surface of D. Tok Pol 
when compared to that of RB69 Pol (Figure 6). The forma- 
tion of networks of ionic interactions has been noted to cor- 
relate with thermostability in other proteins [16,32,33]. 

Generally, D. Tok Pol subdomains tend to be more
compact, with smaller helices and shorter loops than are
found in RB69 Pol, a feature that may be another important
source of thermostability. For example, the palm subdomain
displays close structural conservation of elements near
the catalytic aspartate residues. However, helix αR in
D. Tok Pol is much shorter than its counterpart in RB69
Pol, and a small substructure in front of the palm subdomain
is entirely missing in D. Tok Pol (Figures 3,5). Deletion
of these elements is also seen in a representative set of
archaebacterial DNA polymerases [13,14]. Likewise, the
fingers subdomain is missing a large mass from its tip in
D. Tok Pol (Figures 3,5). However, the RB69 fingers
extension most probably plays a T4 phage-specific role, as
it is also missing from our alignments of archaebacterial
DNA polymerases and eukaryotic polymerase δ (Figure 5).

The N-terminal domain resembles RNA-binding domains

The N-terminal domain of D. Tok Pol has no correspond- 
ing element in Pol I type polymerases. Analysis of the 



Figure 6 



Comparison of surface charges in D. Tok Pol 
and RB69 Pol. Accessible-surface 
representation of (a) D. Tok Pol and (b) RB69 
Pol in the same orientation after superposition 
of their palm subdomains. Surface regions 
corresponding to the terminal oxygen atoms 
of aspartate and glutamate are colored red, 
whereas surface regions contributed by the 
side-chain nitrogen of lysines and arginines are
colored blue. D. Tok Pol has a striking pairing 
of oppositely charged residues not seen in 
RB69 pol. A representation of D. Tok Pol as a 
worm is included for orientation. 
Figure 7

The N-terminal domain of D. Tok Pol. (a) Structural conservation of
RNA-binding domains: structures of RNA-binding domains from 1HA1
(domains A and B), 1PYS, 1RIS, and 1URN (molecule 2) have been
aligned by superimposing (LSQMAN, SUPERPOSE) D. Tok Pol
residues 40-110, which represent four β-strands and
two helices (αA and αB). The N-terminal domain of D. Tok Pol is shown.
A color gradient is used to depict the average rmsd in Cα positions for
the family of superimposed structures, ranging from blue (1.0-1.5 Å) to
white (> 4.0 Å). Certain aromatic residues in D. Tok Pol (white) are
shown; these represent a potential RNA-binding surface. This view is
rotated by approximately 180° from that in Figure 1. (b) An RNA stem-loop
from the U1A-RNA complex (PDB code 1URN, [53]) modeled
onto the N-terminal domain of D. Tok Pol. The model was generated by
superimposing the U1A RBD onto the N-terminal domain of D. Tok Pol
using the conserved structural elements. The RNA is drawn in blue with
the sugar-phosphate backbone represented as a worm and the bases
in ball and stick representation. A partial surface that represents the
interface between the N-terminal domain of D. Tok Pol and the
exonuclease domain is shown in gray. The location of the modeled RNA
relative to the polymerase active site is depicted by marking the position
of residue Y494. The location (derived after superposition) of the
guanosine monophosphate (GMP) molecule bound to the 'incomplete'
RBD of RB69 Pol, drawn in light green, nearly overlaps with the
positions of the bases of the modeled RNA stem-loop. (c) Structural
and primary sequence alignment of RNA-binding domains. Sequence
alignment of the N-terminal domains from D. Tok Pol and RB69 Pol
(incomplete domain) and the RBDs from 1HA1 (domains A and B),
1PYS, 1RIS, 1URN (molecule 2) superimposed as in Figure 7a.
Alignments of the N-terminal domain of D. Tok Pol against DNA
polymerases δ and ε were obtained using CLUSTALX [54], using its
default parameters. The conserved primary sequence motifs RNP1 and
RNP2 are boxed. The alignment is colored by sequence similarity (15%,
white to 75%, green) calculated by averaging the similarity scores at
each position of all possible pairs of sequences (DJ, unpublished
software). Equivalence of nonidentical residues was established by use
of the BLOSUM62 amino acid substitution matrix [55]. Secondary
structural elements corresponding to the N-terminal domain of
D. Tok Pol are represented (pink) with helices as cylinders, strands as
arrows, and other as thin lines. Numbering of residues and naming of
secondary structural elements is that of D. Tok Pol.
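The per-column similarity coloring described in the legend can be sketched roughly as follows. This is only an illustration and not the authors' unpublished software: the equivalence groups are a hypothetical stand-in for a BLOSUM62 lookup, the function names are invented, and the three short sequences are made up rather than taken from Figure 7c.

```python
from itertools import combinations

# ASSUMPTION: hypothetical equivalence groups standing in for a BLOSUM62
# lookup. A real implementation would consult the published matrix.
EQUIVALENT_GROUPS = [set("FYW"), set("KR"), set("DE"), set("ILMV"), set("ST")]


def residues_similar(a: str, b: str) -> bool:
    """Return True if two residues are identical or share a group."""
    if a == "-" or b == "-":
        return False  # gaps never count as similar
    if a == b:
        return True
    return any(a in group and b in group for group in EQUIVALENT_GROUPS)


def column_similarity(alignment: list[str]) -> list[float]:
    """Percent of sequence pairs judged similar at each alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    pairs = list(combinations(range(len(alignment)), 2))
    scores = []
    for col in range(length):
        hits = sum(
            residues_similar(alignment[i][col], alignment[j][col])
            for i, j in pairs
        )
        scores.append(100.0 * hits / len(pairs))
    return scores


# Invented toy alignment (not the sequences of Figure 7c).
toy_alignment = ["YKD-", "FRDA", "WSEG"]
similarity = column_similarity(toy_alignment)
```

Each column score is the fraction of all sequence pairs that are identical or equivalent at that position, which is one plausible reading of "averaging the similarity scores at each position of all possible pairs of sequences".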



structure of this domain using DALI [34]
(http://www.embl-ebi.ac.uk/dali/) revealed a previously
unsuspected similarity to RBDs. RBDs are small modules
(80-90 residues) found in RNA-binding proteins of
prokaryotes, archaea, and eukaryotes (reviewed in [15]).
These modules adopt a conserved βαββαβ architecture
and bind to single-stranded RNA. Two conserved
sequence motifs, referred to as RNP1 (ribonucleoprotein 
1) and RNP2, provide aromatic and charged residues that 
are important for RNA recognition [35] (Figure 7). 

The N-terminal domain of D. Tok Pol can be superim- 
posed closely onto the core secondary structural ele- 
ments of RBDs from the U1A spliceosomal protein [35], 
ribosomal protein S6 [36], the heterogeneous ribonucleo- 
protein (hnRNP) proteins (two RBD domains) [37,38] 
and the anticodon-binding domain from T. thermophilus
phenylalanyl-tRNA synthetase [39]. The rmsds in Cα
positions for these superpositions are in the range of
0.5-2.0 Å (Figure 7a). Differences between the structures
of the loops in the N-terminal domain of
D. Tok Pol and those of the RNA-binding domains are 
within the range of structural variation seen in the 
various RNA-binding domains. 
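For reference, the rmsd values quoted above are root mean square distances between equivalent Cα atoms after superposition. A minimal sketch follows, assuming the two coordinate sets are already superimposed and listed in the same residue order. The superposition step itself (LSQMAN or similar) is not reproduced, and the coordinates are invented for illustration.

```python
import math

Point = tuple[float, float, float]


def rmsd(coords_a: list[Point], coords_b: list[Point]) -> float:
    """Root mean square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented C-alpha positions (in angstroms) for two three-residue fragments.
fragment_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
fragment_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
deviation = rmsd(fragment_a, fragment_b)  # every atom is displaced by 1 angstrom
```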

There is no evidence at present to suggest that the N-ter- 
minal domain of D. Tok Pol binds RNA. However, com- 
parison with the structures of RNA complexes of 
RNA-binding domains shows that the N-terminal domain 
might in fact be a functional RNA-binding domain 
(Figure 7). In particular, three aromatic residues in the 
N-terminal domain (Tyr37, Tyr39 and Tyr86) could inter- 
act with RNA bases in a manner similar to that seen in 
crystal structures of RNA bound to RNA-binding domains 
[35] (Figure 7). Interestingly, these residues are located 
near the position of a guanosine triphosphate molecule 
that is found bound to the N-terminal domain of RB69 Pol 
[1] (Figure 7b). The DNA polymerases from bacterio- 
phage T4 and its distant relative bacteriophage RB69 bind 
specifically to the ribosome-binding site of their own
mRNA (messenger RNA), repressing its translation 
[40-42]. The N-terminal domains of T4 Pol and RB69 Pol 
are smaller than that of D. Tok Pol. In the RB69 Pol struc- 
ture, the N-terminal domain seems to form an 'incom- 
plete' RNA-binding domain (Figure 7c). 

There is no significant overall sequence similarity 
between the N-terminal domain of D. Tok Pol and 
RNA-binding domains, which is why the presence of 
this fold was not recognized previously (Figure 7c). 
Comparison of the sequences of other archaebacterial 
DNA polymerases and human polymerases δ and ε suggests
that a corresponding structural element is likely to
be found in these polymerases as well (Figure 7c). The 
sequence alignment in this region is unambiguous for 
the archaebacterial DNA polymerases. For eukaryotic 
polymerases the alignment is less certain, but it seems to 
conserve the essential aromatic character of the RNP 
motifs (Figure 7c). Confirmation of the presence of
these domains, of their ability to bind RNA, and of
their precise role in eukaryotic DNA synthesis awaits
future structural and functional studies.



Biological implications 

The structure of the DNA polymerase from the archaebacterium
Desulfurococcus strain Tok, D. Tok Pol,
reveals a strong similarity to the DNA polymerase from 
bacteriophage RB69. It also reveals the presence of an 
N-terminal domain that has structural similarity to RNA- 
binding domains from the U1A spliceosomal protein, 
ribosomal protein S6, the hnRNP proteins and the anti- 
codon-binding domain from T. thermophilus pheny- 
lalanyl-tRNA synthetase. Although the structure in the 
immediate vicinity of the central catalytic region of the 
polymerase domain closely resembles that of Pol I type 
DNA polymerases, the overall architecture of D. Tok Pol 
and the placement of the exonuclease domain is strikingly 
different. The similarity between D. Tok Pol and RB69 
Pol suggests that these two structures are representative 
of a common Pol II polymerase fold. Members of this 
family carry out chromosomal DNA replication in 
eukaryotes, including humans, and yet there is no struc- 
tural information available for any eukaryotic member of 
this family. While this manuscript was being prepared, the
structure of another archaebacterial DNA polymerase,
that from the organism Thermococcus gorgonarius, was
reported [56]. The D. Tok Pol structure reported
here, along with the RB69 Pol structure and the structure 
of the Thermococcus gorgonarius DNA polymerase, 
should now make it possible to generate reliable structural 
models for eukaryotic DNA polymerases. 

Materials and methods 

Protein expression and purification 

The D. Tok Pol bacterial expression vector and partial amino acid
sequence were generous gifts of Life Technology Corporation. Convenient
and reproducible protein expression was achieved by cloning
the D. Tok Pol gene into the pET30 plasmid (Novagen). Determination
of the amino acid sequence of the polymerase was completed using
this construct. D. Tok Pol was purified by lysing biomass prepared from
the above expression systems in a French pressure cell (Avestin).
D. Tok Pol precipitated by incubation of the soluble fraction at 80°C for
30 min was further purified by ion-exchange (High-Q, Bio-Rad) and gel-filtration
(Superdex-200, Pharmacia) chromatography. Purified protein
was concentrated to 15 mg/ml by ultrafiltration (Millipore) in 40 mM
TRIS-HCl (pH = 7.4), 50 mM (NH4)2SO4 for crystallization trials.

Crystallization, cryostabilization, and heavy-metal
derivatization

Crystals of D. Tok Pol (maximum dimensions: 200 µm × 150 µm ×
100 µm) were prepared from 100 mM TRIS-HCl (pH = 8.6), 10 mM
MgSO4, 200 mM (NH4)2SO4, 20% (v/v) 2,4 methyl pentane diol (MPD),
11% (w/v) PEG4K, 10 mM dithiothreitol by vapor diffusion at 20°C. These
crystals were cryostabilized in 100 mM TRIS-HCl (pH = 8.6), 10 mM
MgSO4, 200 mM Li2SO4, 20% v/v MPD, 13% w/v PEG4K for 30 min and,
when shock-cooled in freshly thawed liquid propane (-180°C), diffracted
synchrotron wiggler radiation (A1 beamline, Cornell High Energy Synchrotron
Source) to Bragg spacings of 2.4 Å. D. Tok Pol crystallized in space
group P2₁2₁2₁ with cell parameters (Native I: a = 64.8 Å, b = 107.6 Å,
c = 153.2 Å, α = 90°, β = 90°, γ = 90°). VM calculations suggest that there
is one molecule per asymmetric unit with high solvent content. Native data
sets recorded under these conditions resulted in unacceptably high non-isomorphism
between frozen samples. Substitution of PEG400 for MPD in
the crystallization and stabilization media resolved this problem and
allowed structure determination by multiple isomorphous replacement
(MIR) (Native II: a = 66.1 Å, b = 107.6 Å, c = 155.9 Å, α = 90°,
β = 90°, γ = 90°). Heavy-metal derivatives were obtained by soaking
Native II crystals in stabilizing solution containing 10 mM heavy-atom
compound for 24 h.
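The Matthews coefficient (VM) estimate mentioned above can be reproduced approximately from the Native I cell. The sketch below rests on two assumptions that are not stated in the text: a molecular mass of about 90 kDa for the polymerase, and the conventional protein constant of 1.23 Å³/Da for converting VM to a solvent fraction. The function names are illustrative.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume in cubic angstroms of a cell whose angles are all 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume: float, asu_per_cell: int,
                         molecules_per_asu: int, mol_weight_da: float) -> float:
    """V_M in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * molecules_per_asu * mol_weight_da)


def solvent_fraction(v_m: float) -> float:
    """Matthews-style estimate using the conventional 1.23 A^3/Da constant."""
    return 1.0 - 1.23 / v_m


# Native I cell from the text; P2(1)2(1)2(1) has four asymmetric units per cell.
volume = orthorhombic_cell_volume(64.8, 107.6, 153.2)
# ASSUMPTION: ~90 kDa for the polymerase; the text does not state the mass.
v_m = matthews_coefficient(volume, asu_per_cell=4, molecules_per_asu=1,
                           mol_weight_da=90_000.0)
solvent = solvent_fraction(v_m)
print(f"V_M = {v_m:.2f} A^3/Da, solvent fraction = {solvent:.2f}")
```

Under these assumptions VM comes out near 3.0 Å³/Da with somewhat under 60% solvent, which is in line with one molecule per asymmetric unit and a high solvent content. Two molecules per asymmetric unit would halve VM to an implausibly low value.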

Data collection and phase determination 
X-ray diffraction data sets from a set of shock-cooled native and iso- 
morphous heavy-atom derivatives were recorded at the Cornell High 
Energy Synchrotron Source (CHESS) beamline A1 (λ = 0.908 Å). Data
from Native I crystals (prepared with MPD) extended to a Bragg 
spacing of 2.4 A with an = 4.6%. MIR analysis was conducted on 
Native II crystals (prepared with PEG400), which yielded data to 
beyond 2.6 A. X-ray diffraction data were indexed, integrated and 
scaled using the HKL package [43]. 
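As a rough illustration of what these Bragg spacings mean at the beamline wavelength, Bragg's law (λ = 2d sin θ, first order) gives the scattering angle at the resolution limit. The sketch assumes the wavelength 0.908 is in ångströms; the function name is illustrative.

```python
import math


def bragg_two_theta_deg(wavelength: float, d_spacing: float) -> float:
    """Scattering angle 2-theta (degrees) from Bragg's law, first order."""
    sin_theta = wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


# ASSUMPTION: wavelength in angstroms, matching the units of the d-spacings.
two_theta_native_1 = bragg_two_theta_deg(0.908, 2.4)  # Native I limit
two_theta_native_2 = bragg_two_theta_deg(0.908, 2.6)  # Native II limit
```

The 2.4 Å limit corresponds to a scattering angle of a little under 22°, slightly wider than for the 2.6 Å data.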

The positions of heavy atoms were located manually by inspection of 
difference Patterson maps and checked by cross-phased difference 
Fourier maps. Experimental phases were calculated using these sites 
with the program SHARP [17]. In our hands, higher quality electron 
density maps were obtained by performing individual single isomor- 
phous replacement (SIR) calculations in SHARP and combining the 
individual SIR phase sets using the program SIGMAA [19,44]. The
experimental phases were improved and extended by solvent flipping 
and negative-density truncation as implemented in SOLOMON. This 
procedure (SHARP/SOLOMON) yielded electron-density maps of suf- 
ficient quality to allow the entire D. Tok Pol polypeptide to be traced 
unambiguously. This map was dramatically improved over a map calcu- 
lated with MLPHARE/SOLOMON [19]. 

Model building and refinement 

The initial molecular model was built into a 3.0 Å electron-density map
using the interactive molecular graphics program O [45]. Model refinement
was carried out by conjugate gradient minimization, torsion-angle
dynamics, and tightly constrained atomic temperature factor refinement
in the program CNS [20]. Refinement against the 2.6 Å Native II data
set was interspersed with manual rebuilding of the model against
σA-weighted electron-density maps using (2|Fo| - |Fc|) and (|Fo| - |Fc|)
coefficients calculated by averaging structure factors of ten models
resulting from multiple torsion angle dynamics runs [46]. The original
electron-density map remained a useful guide throughout the rebuilding
process. The progress of the refinement was monitored by reductions
in Rfree (10% of the recorded reflections) [47]. Against the Native II
data set the model was refined to an Rfree = 29.5% and Rwork
= 24.2%. The refinement was continued against the 2.4 Å Native I
data set. A rigid-body search in CNS with the 2.6 Å model yielded a
clear solution that was refined as above. The final model for Native I
was refined to an Rfree = 29.9% and Rwork = 25.3%, and the final
model contains residues 1-756 with three disordered regions
(386-389, 665-676, 757-772). The Native II model contains 6030
non-solvent protein atoms, 4 sulfate ions, 2 magnesium ions, and 116
water molecules. The Native I model contains 5992 non-solvent protein
atoms, 9 sulfate ions, 2 magnesium ions, and 106 water molecules.
Model geometry was analyzed using the program PROCHECK [48].
Both models have no outliers in the Ramachandran plot with over 80%
of the residues in the most-favored region.
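The residual statistics quoted above follow the usual definition R = Σ||Fo| - |Fc|| / Σ|Fo|, with the free R computed over a test set of reflections held out of refinement (10% here). The sketch below shows only that bookkeeping. It is not how CNS is invoked, it omits scaling between observed and calculated amplitudes, and the amplitudes are invented.

```python
import random


def r_factor(f_obs: list[float], f_calc: list[float]) -> float:
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the given reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_test_set(n_reflections: int, test_fraction: float = 0.10,
                   seed: int = 0) -> tuple[list[int], list[int]]:
    """Randomly set aside a fraction of reflection indices for the free R."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(test_fraction * n_reflections))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Invented amplitudes purely for illustration.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 90.0, 70.0, 50.0, 30.0, 10.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0, 80.0, 75.0, 45.0, 35.0, 5.0]

work_idx, free_idx = split_test_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work_idx], [f_calc[i] for i in work_idx])
r_free = r_factor([f_obs[i] for i in free_idx], [f_calc[i] for i in free_idx])
r_all = r_factor(f_obs, f_calc)
```

In a real refinement the test-set reflections are excluded from the target function, so the free R acts as a cross-validation check on overfitting.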

Figure preparation 

Figures were composed in programs BOBSCRIPT v1.0 [49], GRASP 
v1.25 [50] and RIBBONS v3.00 [51], with renderings done in
POVRAY v3.1e (http://www.povray.org). Figures 5 and 7c were com- 
posed using ALSCRIPT [52]. 

Accession numbers 

Coordinates have been deposited with the Research Collaboratory for 
Structural Bioinformatics under the accession code 1QQC. 

The 2.25 Å resolution crystal structure of a pol α family (family B) DNA
polymerase from the hyperthermophilic marine archaeon Thermococcus
sp. 9°N-7 (9°N-7 pol) provides new insight into the mechanism of pol α
family polymerases that include essentially all of the eukaryotic replicative
and viral DNA polymerases. The structure is folded into NH2-terminal,
editing 3'-5' exonuclease, and polymerase domains that are
topologically similar to the two other known pol α family structures
(bacteriophage RB69 and the recently determined Thermococcus gorgonarius),
but differ in their relative orientation and conformation.

The 9°N-7 polymerase domain structure is reminiscent of the "closed"
conformation characteristic of ternary complexes of the pol I polymerase
family obtained in the presence of their dNTP and DNA substrates. In the
apo-9°N-7 structure, this conformation appears to be stabilized by an ion
pair. Thus far, the other apo-pol α structures that have been determined
adopt open conformations. These results therefore suggest that the pol α
polymerases undergo a series of conformational transitions during the catalytic
cycle similar to those proposed for the pol I family. Furthermore, comparison
of the orientations of the fingers and exonuclease (sub)domains
relative to the palm subdomain that contains the pol active site suggests
that the exonuclease domain and the fingers subdomain of the polymerase
can move as a unit and may do so as part of the catalytic cycle. This
provides a possible structural explanation for the interdependence of
polymerization and editing exonuclease activities unique to pol α family
polymerases.

We suggest that the NH2-terminal domain of 9°N-7 pol may be structurally
related to an RNA-binding motif, which appears to be conserved
among archaeal polymerases. The presence of such a putative RNA-binding
domain suggests a mechanism for the observed autoregulation of
bacteriophage T4 DNA polymerase synthesis by binding to its own
mRNA. Furthermore, conservation of this domain could indicate that such
regulation of pol expression may be a characteristic of archaea. Comparison
of the 9°N-7 pol structure to its mesostable homolog from bacteriophage
RB69 suggests that thermostability is achieved by shortening loops,
forming two disulfide bridges, and increasing electrostatic interactions at
subdomain interfaces.
Introduction

DNA polymerases catalyze the template-directed
addition of nucleotides onto the 3'-OH group of the
DNA primer terminus. These enzymes replicate
DNA with the accuracy essential for genomic
stability, but generate sufficient mutations to
stimulate and maintain evolution. In contrast to
Eucarya and Bacteria, relatively little is known
about DNA replication in Archaea (Perler et al.,
1996), one of the three major evolutionary lineages
of life (Woese et al., 1990). Archaea play a significant
role in the biosphere, accounting for up to 30% of
the biomass in certain Antarctic waters (De Long
et al., 1994), and exhibit much greater diversity than
had originally been suspected (Barns et al., 1996).
Many characterized archaeal species are adapted to
live in environments of extreme temperature, pressure,
salinity, and/or pH such as hydrothermal
vents and hot springs (Rees & Adams, 1995).

Although archaeal cells share many morphological
features with Bacteria, archaeal proteins
involved in gene expression including DNA replication,
transcription, and translation have been found
to be similar to those from Eucarya (Edgell &
Doolittle, 1997; Bult et al., 1996). In particular, most
of the archaeal DNA polymerases that have been
sequenced belong to the α-like polymerase family
(family B) that includes essentially all the eukaryotic
replicative and viral DNA pols (Braithwaite & Ito,
1993; Edgell et al., 1997).

Crystal structures exist for DNA pols from each
of four families: pol I (family A), pol α (family B),
pol β (family X) and reverse transcriptase (reviewed
by Joyce & Steitz, 1994; Doublié et al., 1999).
Although pols from different families are structurally
quite diverse, several common features have
emerged. The pol domain from each resembles a
right hand and may be further divided into palm,
fingers, and thumb subdomains, as was originally
described for the large fragment of Escherichia coli
pol I (Klenow fragment) (Ollis et al., 1985). All polymerases
appear to share the same mechanism for
nucleotidyl transfer involving two divalent metal
ions (reviewed by Brautigam & Steitz, 1998). In
addition, based on structures containing DNA and
dNTP bound to pols from pol I, pol β, and reverse
transcriptase families, a conformational change in
the fingers subdomain from an open to a closed
conformation is proposed to occur during the catalytic
cycle (reviewed by Doublié et al., 1999).

The pol α family polymerases are of medical
importance as targets for development of antiviral
and anticancer therapeutics. For example, human
pol α is a target in the treatment of acute myelogenous
leukemia and chronic lymphocytic leukemia
(Keating et al., 1982; Robertson & Plunkett, 1993)
and a variety of nucleotide analogs with antitumor
activity inhibit strand elongation by pol α (Huang
& Plunkett, 1995; Gandhi & Plunkett, 1995). Furthermore,
polymerases, particularly those that are
thermostable, have a number of critical biotechnological
applications ranging from PCR to cloning and
DNA sequencing. Despite their biological, medical
and biotechnological importance, the pol α class of
polymerases has not been structurally as well
characterized as other DNA polymerase families.

Here we report the 2.25 Å resolution crystal
structure of a pol α family DNA polymerase from
the hyperthermophilic marine archaeon Thermococcus
sp. 9°N-7 (9°N-7 pol). Thermococcus sp. 9°N-7
was isolated from a hydrothermal vent at 9° N
latitude off the East Pacific Rise (Southworth et al.,
1996). The structure is folded into NH2-terminal,
editing 3'-5' exonuclease, and polymerase domains
that are topologically similar to the two other
known pol α family structures (bacteriophage
RB69 (Wang et al., 1997) and the recently determined
Thermococcus gorgonarius (Tgo) (Hopfner
et al., 1999)), but differ in their relative orientation
and conformation.

The pol domain structure is reminiscent of the
"closed" conformation characteristic of ternary
complexes of the pol I polymerase family obtained
in the presence of their dNTP and DNA substrates.
In the apo-9°N-7 structure, this conformation
appears to be stabilized by an ion pair. Thus far,
the two other apo-pol α structures that have been
determined adopt open conformations. These
results therefore suggest that the pol α polymerases
undergo a series of conformational transitions
during the catalytic cycle similar to those proposed
for the pol I family. Furthermore, comparison of
the orientations of the fingers and exonuclease
domains relative to the palm subdomain that
contains the pol active site suggests that the
exonuclease domain and the fingers subdomain of
the polymerase can move as a unit, and may do so
as part of the catalytic cycle. This provides a possible
structural explanation for the interdependence
of polymerization and editing exonuclease
activities unique to pol α family polymerases.

We suggest that the NH2-terminal domain of
9°N-7 pol is structurally homologous to the
βαββαβ RNA-binding motif with an exposed patch
of aromatic amino acid residues. Bacteriophage T4
DNA pol, which is homologous to 9°N-7 pol, is
known to bind its own mRNA and repress its own
synthesis. The homology relationships to the RNA-binding
motif suggest a structural basis for this
regulatory mechanism. Furthermore, the conservation
of this domain in other archaeal pols suggests
that such autogenous regulation of pol expression
may be general for archaea.

Results and Discussion 

Crystal structure of Thermococcus sp. 
9°N-7 pol 

The structure of the full-length, 775-residue
enzyme (bearing the double mutation D141A and
E143A) was determined using the multiple isomorphous
replacement method to a resolution of
2.25 Å. The current model has an R-factor of
23.9% (Rfree = 30.8%) (Table 1). A Ramachandran
plot of the model shows 86.8% of the residues in
the most favored region and the remainder in
additional allowed regions (12.4%) and generously
allowed regions (0.8%). A total of 37 residues are
not traced in the model and lie in regions of poorly
defined electron density. The first of these gaps
occurs at the bottom of the palm domain (residues
568-575), and the remainder are within the thumb
region that is frequently observed to be partially
disordered in apo polymerase structures, as is also
the case here (e.g. Ollis et al., 1985; Kiefer et al.,
1997). Although no disulfide bridges were included
in the refinement, four Cys residues showed anomalous
peaks in a difference Fourier map and side-chain
distances and angles consistent with two
disulfide bridges (Cys428:442, Cys506:509).

The structure of 9°N-7 pol reveals features common
to all DNA pol structures as well as those that
may be unique to archaeal pols. The overall shape
of the enzyme can be described as a disc with a
central hole that is folded into NH2-terminal, 3'-5'
exonuclease, and polymerase domains (Figure 1(a)
and (b)). Like all other pols of known structure, the
pol domain resembles a right hand and may be
further divided into palm, fingers, and thumb subdomains,
as was originally described for the large
fragment of E. coli pol I (Klenow fragment) (Ollis
et al., 1985). 9°N-7 pol is similar in structure to the
pol α family polymerase from the mesostable bacteriophage
RB69 (RB69 pol) (Wang et al., 1997),
although a number of these (sub)domains are
shorter than in RB69 pol (Figure 1(c)). Nearly all
these sequence length differences are attributable
to loop segments that are fewer and shorter in the
hyperthermostable 9°N-7. As was first observed in
the RB69 pol structure (Wang et al., 1997), the 3'-5'
exonuclease domain lies on the opposite side of the
palm in comparison to pol I family polymerases.
This domain arrangement is also seen in 9°N-7 pol
and in Tgo pol (Hopfner et al., 1999), indicating that
this result is likely to be general for the pol α
family. The structural similarity between 9°N-7
and RB69 pols is significant given the low sequence
identity (<20%) in all but the active-site (palm)
region, where sequence identity is 42% (Figure 2).
Similar results hold for sequence alignments
between 9°N-7 and human pol α.
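Percent identities such as those quoted above depend on how gapped columns are counted. The sketch below uses one common convention, counting only columns where both sequences have a residue; this choice is an assumption, since the text does not state its convention. The ten-column alignment is invented and is not taken from Figure 2.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where both sequences are aligned."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# Invented ten-column alignment; two columns contain a gap and are skipped.
identity = percent_identity("MKV-LDTDYI", "MRVAL-TEYV")
```

Here five of the eight gap-free columns match. Dividing instead by the full alignment length or by the shorter sequence length would give a different figure for the same alignment.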

NH2-terminal domain

Many of the members of the pol α polymerase
family, including archaeal pols, bacteriophage T4
and RB69 DNA pols, have an NH2-terminal
domain that is not observed in the pol I family. T4
pol is known to control its synthesis in vivo by a
mechanism of autogenous regulation (Tuerk et al.,
1990). The mRNA-binding activity has been
located to within the first 100 residues of the pol
(Wang et al., 1996), but the structure of a fragment
comprising residues 1-388 of T4 pol failed to
suggest a structural basis for RNA binding (Wang
et al., 1996). Here, we note that certain structural
similarities between the homologous region in the
9°N-7 pol and the U1A RNA-binding protein may
provide a rationale for RNA binding by T4 pol.

The NH2-terminal domain of 9°N-7 pol can be
considered as three modules based on compactness
of folding (Figure 3(a)). The first module comprises
residues 1-31, a three-stranded β-sheet that interacts
extensively with the 3'-5' exonuclease domain
via predominantly electrostatic interactions. Residues
32-36 act as a flexible linker connecting the
first module to the second (residues 37-123). The
third module comprises residues 338-372.

The second module is folded into a βαββαβ
motif, with two short β-strands, 5 and 6, inserted
between the second and third elements. This motif
occurs in a variety of proteins, and forms the basis
for the most prevalent RNA binding motif, the
RNA recognition motif (RRM). The RRM is present
in the RNA-binding domains of hnRNP A1, spliceosomal
protein U1A and U2B", and the sex lethal
protein (Burd & Dreyfuss, 1994). Although an alignment
of the NH2-terminal domains of archaeal pols
(Figure 3(b)), together with T4 and RB69 pols,
shows that they lack the RNP1 and RNP2 sequence
motifs that characterize the RRM (Burd & Dreyfuss,
1994), a number of highly conserved and invariant
residues nevertheless emerge. Most of these residues
fall in a cluster on the surface of the NH2-terminal
domains of 9°N-7 and RB69 pols which
therefore could mark the location of an RNA binding
site atop the β-sheet platform on the face away
from helix A (Figure 3(c)).
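The topology described above can be checked against the secondary-structure assignments listed in the Figure 1 legend for residues 37-123. The sketch below copies those element boundaries and reads numbered elements as strands and lettered elements as helices, as the text does for "strands 5 and 6" and "helix A". The helper name and the string encoding are illustrative only.

```python
# Secondary-structure elements of the second module (residues 37-123),
# copied from the Figure 1 legend: numbers are strands, letters are helices.
ELEMENTS = [
    ("4", 37, 42), ("A", 48, 51), ("5", 55, 58), ("6", 61, 64),
    ("7", 67, 75), ("8", 78, 86), ("B", 92, 101), ("9", 106, 110),
    ("C", 116, 123),
]
INSERTED_STRANDS = {"5", "6"}  # described in the text as an insertion


def topology(elements, skip=frozenset()) -> str:
    """Return the elements in sequence order as a string of β and α symbols."""
    ordered = sorted(elements, key=lambda item: item[1])
    return "".join(
        "β" if label.isdigit() else "α"
        for label, _start, _end in ordered
        if label not in skip
    )


full_topology = topology(ELEMENTS)
core_topology = topology(ELEMENTS, skip=INSERTED_STRANDS)
has_motif = core_topology.startswith("βαββαβ")
```

With the two inserted strands left out, the remaining elements read β4, αA, β7, β8, αB, β9, followed by helix C, so the module begins with the βαββαβ pattern named in the text.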

Both a sequence alignment (Figure 3(b)) and a
structural comparison (Figure 3(c)) reveal that T4
and RB69 pols lack helix A and strand 7 of the
βαββαβ motif, perhaps explaining why no suggestive
structural homologies to RNA-binding folds
could be identified (Wang et al., 1996, 1997).

Experiments are needed to determine whether
the NH2-terminal domain of 9°N-7 pol binds RNA.
Although the βαββαβ motif occurs in proteins that
are not thought to interact with RNA (Burd &
Dreyfuss, 1994), we find its presence in the NH2-terminal
domain of 9°N-7 pol, in a region known to
bind RNA in T4 pol (Wang et al., 1996), to be highly
suggestive of RNA binding. RNA-binding capability could
hold for other archaeal pols as well, since sequence
alignment of the NH2-terminal domains (Figure 3(b))
suggests that they share the βαββαβ motif.

We further speculate that just as T4 pol binds its
mRNA to down-regulate its own synthesis, such
autogenous regulation of pol expression might
occur in archaea. Autogenous gene regulation is
well documented in bacteria, and has at least one
precedent in archaea. It has been identified in
the synthesis of the MvaL1 ribosomal protein of
Methanococcus vannielii (Hanner et al., 1994), and
postulated for a ribosomal gene cluster from the
halophile Halobacterium cutirubrum (Shimmin &
Dennis, 1989). It is interesting that there is no structural
evidence that such regulation extends to
eukaryotes, as human pol α shows no significant
sequence homology to the NH2-terminal sequences
aligned in Figure 3(b).

3'-5' Exonuclease domain 

This domain is responsible for binding single- 
stranded DNA and excising mismatched bases in 
the elongated primer strand. The structure 
Figure 1. Structure of the Thermococcus sp. 9°N-7 DNA polymerase. The NH2-terminal and 3'-5' exonuclease
domains are colored yellow and green, respectively. The polymerase domain is divided into palm (brown), thumb
(red), and fingers (blue) subdomains. Three highly conserved carboxylate groups (D404, D540, D542) mark the polymerase
active site. (a) Stereoview of the Cα trace. Every 40th Cα is numbered. Broken lines indicate disordered regions
of the protein. (b) Ribbon diagram with secondary structure elements defined according to DSSP (Kabsch & Sander,
1983). NH2-terminal domain: 1, 1-10; 2, 13-22; 3, 25-31; 4, 37-42; A, 48-51; 5, 55-58; 6, 61-64; 7, 67-75; 8, 78-86; B, 92-101;
9, 106-110; C, 116-123; J, 341-344; K, 349-363. 3'-5' exonuclease domain: 10, 137-144; 11, 157-163; 12, 168-172; 13,
181-183; D, 187-201; 14, 205-208; E, 215-225; 15, 240-244; 16, 247-251; 17, 256-259; F, 260-266; G, 275-283; H, 292-300;
I, 305-337. Polymerase domain: L, 374-379; 18, 397-404; M, 408-415; 19, 431-433; 20, 440-442; N, 448-468; O, 473-498;
P, 507-532; 21, 535-539; 22, 543-547; Q, 553-567; 23, 578-590; 24, 593-598; 25, 603-606; R, 617-633; S, 636-651 (648-651
disordered); T, 657-660 (disordered); 26, 662-665; U, 677-688; 27, 698-703; 28, 714-716; V, 731-734; W, 742-746. (c) Schematic
comparing the (sub)domains of Thermococcus sp. 9°N-7 and bacteriophage RB69 DNA polymerases. The
domain boundaries for 9°N-7 pol were determined based upon a structure-based sequence alignment with RB69 pol
(Figure 2) as defined for the RB69 pol (Wang et al., 1997).




Figure 2. A three-way partial sequence alignment of Thermococcus sp. 9°N-7 pol (9N-7), RB69 pol (RB69), and
human pol α (HPOL). Dashes indicate gaps in the alignment, and segments not aligned are represented as amino
acid residue spans within brackets. Ticks mark every 10 spaces. The 9°N-7 and RB69 pol alignment is based upon the
crystal structures. The HPOL and RB69 alignment is from Wang et al. (1997), except for a few short segments
assigned based upon the three sequences shown here. Indicated below the sequences and boxed in yellow are consensus
motifs in the exonuclease (Blanco et al., 1992) and polymerase (Wong et al., 1988) domains. The secondary structure
elements in 9°N-7 pol, as defined by DSSP, are given above the sequences. The structural elements are colored
according to the scheme described in the legend to Figure 1. Shown in purple in the 9°N-7 pol sequence are the archaeal
polymerase motifs described by Edgell et al. (1997). Residues within the polymerase domain that are invariant in the
three sequences are marked with blue boxes; residues discussed in the section on dNTP binding, with blue asterisks. The two disulfide
bridges in the palm (C428:C442, C506:C509) are shown schematically.



reported here is that of a mutant of 9°N-7 pol lacking
detectable exonuclease activity which was
engineered to prevent degradation of DNA substrates
during subsequent co-crystallization experiments.
This 9°N-7exo⁻ pol was obtained by
making two point mutations (D141A, E143A) in
the Exo I (DxE) motif highly conserved among the
3'-5' exonuclease domains of many DNA pols
(Derbyshire et al., 1995; Blanco et al., 1992). In the
Klenow fragment (KF) of E. coli DNA pol I, these
residues (D355, E357) are responsible for binding
the catalytic metals and for hydrogen-bonding
with the 3'-OH of the terminal deoxynucleotide of
the substrate DNA (Beese & Steitz, 1991).

Aside from loop segments that are shorter than 
those observed in RB69 pol (see below), the top- 
ology of the exonuclease domain in 9°N-7 pol is 
very similar to that of RB69 pol. The domains 
superimpose in the central β-sheet, containing the 
active site, with a root mean square deviation 
(rmsd) of 0.95 Å (35 Cα atoms). The metal-binding 
residues not mutated in 9°N-7exo⁻ pol, D215 and 
D315, superimpose almost exactly on the corre- 
sponding RB69 pol residues (D222, D327). 
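
The rmsd values quoted for these superpositions follow the usual definition: the square root of the mean squared distance between paired Cα atoms. The following is a minimal sketch in Python, assuming the two coordinate sets are already superimposed and listed in matching order. The least-squares fitting step itself is not shown, the function name is illustrative, and the coordinates are invented for the example rather than taken from either structure.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates that are already superimposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Invented toy coordinates (angstroms): every atom in the second set is
# displaced by exactly 1 angstrom along x relative to the first set.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
toy_rmsd = rmsd(set_a, set_b)
print(f"rmsd = {toy_rmsd:.2f} A over {len(set_a)} atoms")
```

With a uniform 1 Å shift the rmsd is 1 Å, which gives a sense of scale for the sub-ångström values reported for the exonuclease and palm cores.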

It is now possible to assign a structural context 
to the four archaeal sequence motifs identified by 
Edgell et al (1997). Three of the regions (A-C) lie 
within the exonuclease domain (Figure 2). Motif A 
forms part of the central β-sheet containing 
the active site; B, part of a solvent-exposed loop; 
and C, part of a five-stranded β-sheet nearly 
perpendicular to the central β-sheet. The fourth 
motif resides in the palm (see below). 

Pol domain 

This domain is responsible for the template- 
directed polymerization of dNTPs onto the grow- 
ing primer strand of duplex DNA. Like other poly- 
merases of known structure, the pol domain can be 
further divided into palm, fingers, and thumb sub- 
domains. While the structures of the thumbs of 
9°N-7 and RB69 pols are highly similar, differences 
exist in the palm and fingers. Some of these differ- 
ences correspond to features that appear unique to 
archaeal pols, while others support a hypothesis 
that a conformational change occurs in the fingers 
as part of the catalytic cycle. 

Palm subdomain 

The palm, which contains the active site for 
polymerization, shows a high degree of structural 
similarity to the palm subdomain of other DNA 
polymerases. It is as structurally similar to pol I 
family polymerases as to those of the pol α family. 
Its rms deviation from RB69 pol around the active 
site (blue region in Figure 4(b)) is 0.84 Å (26 Cα 
atoms). Together with the Tgo pol structure 
(Hopfner et al, 1999), this structure confirms for 
archaea the conservation of a common catalytic 
core. A significant difference between the palm 
subdomains in 9°N-7 and RB69 pols is the presence 
of two disulfide bridges in 9°N-7 pol, one joining 
Cys428 and 442 and another joining Cys506 and 
509 (Figure 4(b)). Both the shortened loops and at 
least one disulfide bridge appear common to 
archaeal pols (see above). Indeed, the region con- 
taining one of the Cys residues in a disulfide bridge 
(C442) corresponds to the highly conserved archae- 
al motif D (Edgell et al, 1997; Figure 2). The Tgo 
pol structure shows the corresponding Cys resi- 
dues to be "poised" for disulfide formation, but 
still in reduced form. 

Until recently it was believed that all pols share 
a catalytic "triad" of carboxylate residues in the 
active site in the palm (Delarue et al, 1990). Wang 
et al (1997) have since recognized that only two of the 
carboxylate residues are invariant. The invariant 
carboxylates in 9°N-7 pol are D404 and D542. The 
third member of the triad, present as D540 in 
9°N-7 pol, is not essential: mutation at the corre- 
sponding residue (D1002N) in human pol α retains 
catalytic function (Copeland et al, 1993). D540 in 
9°N-7 pol may nevertheless be involved in binding 
the divalent metals required for catalysis. Mg²⁺ is 
normally the optimal metal for human pol α 
activity. The pol α D1002N mutant shows greater 
catalytic efficiency and fidelity with Mn²⁺ rather 
than Mg²⁺ (Copeland & Wang, 1993). 

D540 in 9°N-7 pol lies within hydrogen-bonding dis- 
tance of the hydroxyl group of Y538. Substitution of 
the corresponding residue in human pol α (Y1000) 
with Phe causes only minor effects on catalysis but 
alters the metal affinity of the pol, akin to the 
pol α D1002 mutation (Copeland & Wang, 1993). It 
seems likely that the hydroxyl moiety of Y538 in 
9°N-7 pol helps to lock D540 in position for Mg²⁺- 
specific binding. Consistent with this function is the 
strict conservation of Y538 among pol α family 
members (Braithwaite & Ito, 1993). 

Fingers subdomain 

The fingers subdomain of 9°N-7 pol differs in top- 
ology and relative conformation from that of RB69 
pol. The fingers of 9°N-7 pol are a simple helix-coil-helix, as 
in Tgo pol (Hopfner et al, 1999), whereas in the fin- 
gers of RB69 pol, the coil region is expanded with 
more secondary structure elements (Figures 2 and 
5). The shorter fingers of 9°N-7 pol are conserved 
among the archaeal pols aligned by Edgell et al 
(1997). It is possible that the fingers of archaeal 
pols define a minimal functional unit. 

Different positions of the fingers subdomain rela- 
tive to the palm are observed in the 9°N-7 and 
RB69 pol structures (Figure 5(a)). The fingers of 
Tgo pol (Hopfner et al, 1999) show a position inter- 
mediate between that in 9°N-7 and RB69 pols, 
when the palm subdomains of all three enzymes 
are aligned. It is interesting to note that the fingers 
subdomains of polymerases in the pol I family 
adopt different positions during the catalytic cycle 
(reviewed by Doublie et al, 1999). An open pos- 
ition corresponds to that seen in the apoenzyme 
form (Ollis et al, 1985; Kim et al, 1995; Korolev 
et al, 1995; Kiefer et al, 1997) and the form bound 
to duplex DNA (Eom et al, 1996; Kiefer et al, 
1998). A closed conformation has been observed in 
the ternary replication complexes of bacteriophage 
T7 pol (Doublie et al, 1998) and Klentaq (Li et al, 
1998) with bound DNA and dNTP. An analogous 
conformational change has been observed in tern- 
ary complexes of human immunodeficiency virus 
reverse transcriptase (Huang et al, 1998) and rat 
pol β (Pelletier et al, 1994). In the closed confor- 
mation the fingers rotate towards the palm to form 
a binding pocket for dNTPs. 

The differences in position of the fingers sub- 
domain in the three pol α family crystal structures 
suggest that the fingers of pol α family pols move 
during catalysis, analogous to the movement observed 
for the other polymerase families. It is interesting to note 
that if this is the case, there must be a correspond- 
ing movement in the position of the 3'-5' exo- 
nuclease domains not required in the other 
polymerase families as will be discussed below. If 
the position of the fingers in 9°N-7 pol more 
closely approximates a closed conformation, it is 
not clear why they would adopt a position pre- 
viously observed only in ternary complexes with 
bound dNTP and DNA. The fingers of 9°N-7 pol 
may be stabilized in this conformation because of a 
salt-bridge between E578 in the palm and K487 on 
helix O of the fingers. These residues are highly 
conserved among archaeal pols (Edgell et al, 1997) 
and both the pol I and pol α families (Braithwaite & 
Ito, 1993). The corresponding salt-bridge does not 
form in polymerases of the pol I family because 
the fingers helix O lies too far from the palm. The 
fingers of Tgo pol, in fact, are rotated slightly away 
from the active site, relative to 9°N-7 pol, such that 
the E578:K487 salt-bridge cannot form. Another 
possible explanation for the difference in finger 
positions is the presence of disulfide bridges in 9°N-7 
pol that are absent in the Tgo pol structure and in pol I 
family structures. At least one of the disulfides 
(Cys428:442) in 9°N-7 pol could be directly 
involved in orienting the fingers relative to the 
palm (Hopfner et al, 1999). 

Model for DNA and dNTP binding 

Based on the high degree of structural homology 
of the palm subdomains between 9°N-7 and pol I 
family pols, DNA and dNTP substrates from the 
bacteriophage T7 pol ternary complex (Doublie 
et al, 1998) were modeled into the 9°N-7 pol active 
site. The model shown in Figure 6 provides further 
evidence that the position of the fingers in 9°N-7 
pol more closely approximates a closed confor- 
mation and their position in RB69 pol approxi- 
mates an open conformation. This model of a 
ternary complex for a pol α family polymerase 
places the dNTP within hydrogen-bonding dis- 
tance of residues on the fingers O helix that are 
highly conserved and known by mutagenesis to be 
functionally important. The corresponding residues 
on fingers helix P of the RB69 pol are farther away 
and cannot directly interact with dNTP. 

Figure 3. The RNA-binding motif in the NH₂-terminal domain of 9°N-7 pol. (a) Topology diagram of the complete 
NH₂-terminal domain (residues 1-129, 338-372). The RNA-binding motif βαββαβ, known as the RNP recognition motif 
(Burd & Dreyfuss, 1994), is boxed. (b) Sequence alignment of the NH₂-terminal domains of 9°N-7 pol, RB69 pol, T4 
pol, and archaeal pols. Alignment of 9°N-7 pol and T4/RB69 is based upon the crystal structures, and that of 9°N-7 
pol and the other archaeal polymerases is based upon sequence alignment of 13 sequences among those considered 
by Edgell et al (1997) (data not shown). The archaeal polymerase alignment was performed with the PILEUP algor- 
ithm in the GCG package (University of Wisconsin Genetics Computer Group). Secondary structure elements corre- 
sponding to 9°N-7 pol are given above the sequence. A consensus sequence was derived for the archaeal 
polymerases at those positions where at least 70% of the 13 sequences shared the same residue. Boxed in yellow are 
those residues conserved between the archaeal consensus and both bacteriophage (T4, RB69) sequences. Position 367 
in 9°N-7 pol is starred (see the text for discussion). Abbreviations are as follows: PFU, Pyrococcus furiosus; SACD, 
Sulfolobus acidocaldarius; MJAN, Methanococcus jannaschii; POCC, Pyrodictium occultum B1. (c) Ribbons representation 
of the NH₂-terminal domain of 9°N-7 (left) and RB69 (right) pols. Least-squares Cα superposition was performed over 
the region of 9°N-7 pol including strand 4, part of strand 8, and helices B and C, and the domains were separated for 
side-by-side comparison. Shown in green is the βαββαβ RNA-binding motif. Charged and aromatic archaeal consen- 
sus residues are shown with green side-chains, and yellow side-chains correspond to the residues boxed in yellow in 
(b). The loop between β strands 7 and 8 in 9°N-7 pol corresponds to the conformationally variable loop 3 in the 
canonical RNP motif (Shamoo et al, 1997). 

The model places residues Y409 and Y494 near 
the deoxyribose moiety of the incoming dNTP. 
These residues appear to be functionally analogous 
to E480 and Y526 of T7 pol, which are responsible 
for discriminating between deoxy- and ribonucleo- 
tides (rNTPs). Y409 is invariant among the pol α 
family in the alignment by Braithwaite & Ito (1993) 
and nearly invariant (one exception) among 
archaeal pols aligned by Edgell et al (1997). 
Mutation of the corresponding residue (Y412) 
to Val in an exonuclease-deficient Thermococcus 
litoralis (Vent) pol causes a 200-fold loss of 
discrimination against rNTPs. The aromatic ring 
appears to be the functionally important moiety, 
as mutating Y412 to Phe conserves wild-type 
discrimination levels (Gardner & Jack, 1999). 

Y526 in T7 pol (F762 in Klenow fragment) has 
been dubbed the "ribose selectivity site" (Tabor & 
Richardson, 1995). A Phe residue at this position 
confers selectivity against incorporation of dideox- 
yribonucleotides (ddNTPs), whereas a Tyr residue 
in this position allows efficient incorporation of 
both nucleotide species. The presence of Tyr (Y494) 
in this position in 9°N-7 pol suggests the ability to 
incorporate dideoxynucleotides, as do Vent 
(Gardner & Jack, 1999) and human pol α (Cope- 
land et al, 1992). In fact, Tyr is invariant at this 
position among the archaeal pols aligned by Edgell 
et al (1997), and highly conserved in the pol α 
family aligned by Braithwaite & Ito (1993). 

The model of a ternary complex with dNTP and 
DNA places residues N491 and K487 in hydrogen- 
bonding distance from the triphosphate moiety of 
the incoming dNTP. Both of these residues are 
invariant in the pol α family (Braithwaite & Ito, 
1993), and nearly invariant (one exception) among 
archaeal pols (Edgell et al, 1997). Mutation of the 
corresponding residues (N494, K490) in Vent (exo⁻) 
pol severely decreases enzyme activity (Gardner & 
Jack, 1999). 

Figure 4. Comparisons of 9°N-7 and RB69 pols in different (sub)domains to indicate loop segments that are shorter 
in 9°N-7 pol. Least-squares Cα superposition was performed over the region in blue, and the domains were separated 
for side-by-side comparison. Loop regions are shown in magenta and their residue endpoints are labeled. 
(a) Comparison of the exonuclease domains. Indicated with purple asterisks are the active site carboxylates (mutated 
to Ala in the case of the 9°N-7exo⁻ pol used in this study). (b) Comparison of the palm domains. The three 
active-site carboxylate groups are depicted with side-chains. 

Concerted domain movement 

The difference in position of the fingers sub- 
domain in 9°N-7 and RB69 pols is part of a larger 
conformational change involving the 3'-5' exo- 
nuclease and NH₂-terminal domains. Comparing 
these two pol structures shows that in one of the 
pair, an essentially rigid-body rotation has 
occurred involving three of the five (sub)domains. 
This concerted movement affects both the position 
of the fingers relative to the pol active site (open 
versus closed conformation), as well as the position 
of the exonuclease active site relative to the pol 
active site. The 9°N-7 and RB69 pol structures may 
approximate different states along the reaction 
pathway corresponding to DNA synthesis and 3'-5' 
exonucleolytic proofreading activities. 

When these two polymerases are aligned in the 
palm (the blue region in Figure 4(b)), the exo- 
nuclease and fingers are displaced between the 
proteins (Figure 5(a)). If the enzymes are aligned in 
the exonuclease domain (see Figure 4(a)), the 
fingers superimpose almost exactly (Figure 5(b)). 
Moving from a palm-based to an exonuclease-based 
alignment also brings the first module (residues 
1-31) of the NH₂-terminal domains into identical 
positions (not shown). The joint motion of the first 
NH₂-terminal module and the exonuclease may 
reflect the need to maintain ionic networks at the 
interface. There are two five-membered ionic net- 
works formed between the first module and exo- 
nuclease (Figure 7). In addition, a three-membered 
network is formed between the third NH₂-module 
(R346) and the exonuclease (Figure 7). This net- 
work is conserved among nearly all archaeal pols 
(Edgell et al, 1997), but none is present in RB69 
pol. 

Figure 5. Least-squares Cα superpositions of 9°N-7 and RB69 pols in the (a) palm subdomain or (b) exonuclease 
domain. The 9°N-7 pol backbone is shown in yellow, and its active-site carboxylate groups in gold. The RB69 pol 
backbone is drawn in green, and its active-site residues in magenta. The central β-sheet of the exonuclease domain 
is light blue (9°N-7 pol) or dark blue (RB69 pol) to allow tracking of the domain motion. The precise regions used 
in the palm and exonuclease superpositions are shown in Figure 4. The NH₂-terminal domain has been omitted for 
clarity. Arrows in (a) indicate the direction of fingers and exonuclease movement when moving from (a) to (b). 

Comparison of the Tgo pol structure (Hopfner 
et al, 1999) with that of 9°N-7 and RB69 pols using 
palm and exonuclease-based superpositions gives 
results similar to those in Figure 5, providing 
further support for the notion of a concerted 
domain movement. 

A model was constructed for the RB69 pol 
(Wang et al, 1997) showing how substrate DNA 
could shuttle between the pol and exonuclease 
active sites. When 9°N-7 and RB69 pols are aligned 
in the palm, the exonuclease active site in the 
former is tilted out and away from the pol active 
site, making it impossible for the DNA to shuttle. 
The exonuclease position in RB69, but not that in 
9°N-7 pol, is therefore consistent with an editing 
conformation. It is interesting that this confor- 
mation also means that the fingers are not in 
position to bind dNTP (see above). Taken together, 
these considerations suggest that during the 
replication cycle of family B pols, there is concerted 
movement of the exonuclease, NH₂-terminal 
domain, and fingers relative to the catalytic region 
of the palm. 

This concerted movement may be the structural 
basis for the functional coupling of polymerase 
and exonuclease domains, which is unique to the 
pol α family. In this family it is possible to generate 
site-directed mutations in one domain that exert an 
indirect, negative effect on the other (Reha-Krantz 
& Nonay, 1993; Abdus Sattar et al, 1996). This con- 
trasts with pol I pols like KF, where these activities 
are completely confined to their respective 
domains (Ollis et al, 1985). 

Molecular basis of thermostability 

Figure 6. The active site of 9°N-7 pol and a modeled ternary complex. (a) Stereoview of the active site. Residues 
with indicated side-chains are discussed in the text. Hydrogen bonds are shown as broken lines, and the two disulfide 
bridges are shown in violet. K487 in this structure is involved in a salt-bridge with E578 of the palm. (b) Model of 
a ternary complex of 9°N-7 pol. For clarity only the incoming base and the first primer:template base-pair are 
shown. Hydrogen bonds are shown as broken lines, and metal ions are modeled as green spheres. The 9°N-7 pol and 
T7 pol ternary complex (Doublie et al, 1998) were superimposed in the palm (0.55 Å rmsd for 13 Cα atoms). The 
rotamer conformation was adjusted for D542 and D404 in 9°N-7 pol, and the β turn including D542 was tilted 
downward, in a motion analogous to that observed between the apoenzyme and binary complex structures of 
Bacillus stearothermophilus pol (Kiefer et al, 1997, 1998). 

Thermococcus sp. 9°N-7 grows at temperatures of 
88-90 °C, and its pol has a temperature optimum of 
70-80 °C (Perler et al, 1996). It has a half-life of 6.7 
hours at 95 °C (R.B. Kucera, unpublished results), 
whereas Thermus aquaticus (Taq) DNA pol has a 
half-life of 1.6 hours at 95 °C (Kong et al, 1993). The 
structure of 9°N-7 pol indicates a few key strategies 
for this hyperthermostability, some of which appear 
general to archaeal DNA pols. 
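
To put the two half-lives in perspective, the sketch below estimates the fraction of activity remaining after a given incubation at 95 °C. It assumes simple first-order (single-exponential) inactivation, which the text does not state, and the function name is illustrative only.

```python
def fraction_active(hours, half_life_hours):
    """Fraction of initial activity left after `hours`, assuming
    first-order decay with the given half-life (an assumption)."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)

# Half-lives at 95 degrees C as quoted in the text.
HALF_LIFE_9N7 = 6.7  # hours
HALF_LIFE_TAQ = 1.6  # hours

# Compare both enzymes after one 9°N-7 half-life.
remaining_9n7 = fraction_active(6.7, HALF_LIFE_9N7)
remaining_taq = fraction_active(6.7, HALF_LIFE_TAQ)
print(f"9N-7 pol: {remaining_9n7:.1%} active, Taq pol: {remaining_taq:.1%} active")
```

Under this assumption, by the time 9°N-7 pol has lost half of its activity, Taq pol would retain only about 5% of its own.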

A surprising feature of the 9°N-7 pol is that it 
contains two disulfide bridges (Figures 1(a) and 
6(a)). The potential for the same bridges to form 
was also observed in Tgo pol (Hopfner et al, 1999). 
Although not normally the case in Bacteria or 
Eucarya, an increasing number of cytosolic pro- 
teins with disulfide bridges are being discovered in 
the Archaea (DeDecker et al, 1996; Singleton et al, 
1999). The stabilizing role of disulfide bridges 
has been well documented (Gokhale et al, 1994; 
Cooper et al, 1992). Introduction of disulfide 
bridges therefore appears to be a common strategy 
for archaeal protein stability. 

Alignment of a large number of archaeal pols 
(Edgell et al, 1997) suggests that having at least 
one of these disulfides is important for their 
thermostability. In fact, the two-stranded β-sheet 
containing C442 corresponds to sequence motif D 
in archaeal pols (Edgell et al, 1997). Based on 
whether Cys is present in the corresponding pos- 
itions, all the pols discussed by Edgell et al (1997) 
are predicted to have at least one of the two disul- 
fide bridges seen in 9°N-7 pol, with the exception 
of M. voltae and S. shibatae B3 pols. The mesostabil- 
ity of M. voltae pol may be partly caused by a lack 
of disulfide bridges. The S. shibatae B3 pol, like the 
S. solfataricus P2 B3 pol, is highly divergent in 
sequence from other archaeal pols, and it is unclear 
whether either of these functions in vivo (Edgell 
et al, 1997). 

Figure 7. The extensive ionic networks at the interface of the NH₂-terminal and 3'-5' exonuclease domains. 

An increased number of salt-bridges relative to 
mesostable homologs is often cited as a determi- 
nant of protein thermostability (DeDecker et al, 
1996; Korndorfer et al, 1995; Chan et al, 1995; 
Hennig et al, 1995). The 9°N-7 pol shows a sub- 
stantial increase in the fraction of charged residues 
participating in salt-bridges (47%) compared with 
RB69 pol (39%). These results are similar to those of 
a thermostability study of Pyrococcus furiosus glutamate 
dehydrogenase (Yip et al, 1995). The authors of 
that study found a marked preference for Arg resi- 
dues in the ionic interactions of the thermostable 
enzyme, but no such preference is evident here. 
The same fraction (48%) of Arg residues is used in 
ionic interactions in both 9°N-7 and RB69 pols, 
whereas a much higher proportion of Glu residues 
participates in salt-bridges in the 9°N-7 pol (53%) 
compared with RB69 pol (33%). 
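
A fraction of this kind can be tallied along the following lines. This is only a sketch: the 4.0 Å distance cutoff, the use of one representative charged atom per residue, and the toy labels and coordinates are all assumptions made for illustration, since the criteria actually used to define a salt-bridge are not given here.

```python
import math

def salt_bridge_fraction(charged_sites, cutoff=4.0):
    """Fraction of charged residues that have an oppositely charged
    partner within `cutoff` angstroms.

    charged_sites: list of (residue_label, charge_sign, (x, y, z)),
    one representative charged atom per residue (a simplification).
    """
    if not charged_sites:
        raise ValueError("no charged residues supplied")
    paired = set()
    for i, (res_i, sign_i, xyz_i) in enumerate(charged_sites):
        for res_j, sign_j, xyz_j in charged_sites[i + 1:]:
            opposite = sign_i * sign_j < 0
            if opposite and math.dist(xyz_i, xyz_j) <= cutoff:
                paired.update((res_i, res_j))
    return len(paired) / len(charged_sites)

# Hypothetical residues with invented coordinates (angstroms).
toy_sites = [
    ("Lys-1", +1, (0.0, 0.0, 0.0)),
    ("Glu-2", -1, (3.0, 0.0, 0.0)),   # 3 A from Lys-1: counted as a pair
    ("Arg-3", +1, (20.0, 0.0, 0.0)),  # no partner within the cutoff
    ("Asp-4", -1, (40.0, 0.0, 0.0)),  # no partner within the cutoff
]
toy_fraction = salt_bridge_fraction(toy_sites)
print(f"{toy_fraction:.0%} of the toy charged residues are in salt-bridges")
```

Restricting `charged_sites` to one residue type (for example, only Glu residues and their potential partners) would give per-residue proportions comparable in spirit to those quoted for Arg and Glu.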

The number and distribution of salt-bridges 
within domains does not substantially differ 
between 9°N-7 and RB69 pols. At the interfaces 
between (sub)domains, however, the differences in 
ionic networks are striking. The proportion of ionic 
interactions at interfaces in the 9°N-7 pol (21 %) is 
over twice that in RB69 pol (9 %). The differences lie 
at the interface of the exonuclease domain with the 
NH₂-terminal domain (Figure 7), and at the inter- 
face of the exonuclease with the thumb, where a 
two- and a three-membered ionic network occur in 
9°N-7 pol compared with none in RB69 pol (not 
shown). 

Burial of the charged termini of proteins has 
been cited as another factor that can confer thermo- 
stability (Hennig et al, 1995). The NH₂-terminal 
methionine (M1) of 9°N-7 pol is stabilized by a 
hydrophobic cluster formed by L135, F327, I256, 
V205, and L341 while the corresponding residue 
of RB69 pol is completely exposed to solvent. The 
B-factor for the Cα of M1 in 9°N-7 pol is 26 Å², 
whereas for M1 in RB69 pol, it is 95 Å². While 
burial of the N terminus may be important for the 
thermostability of the 9°N-7 pol, the same does not 
hold for the C terminus. The last 25 residues are 
not visible in the electron density, similar to the 
case of RB69 pol. The solvent accessibility of the 
C terminus of these pols may reflect the need for 
this region to interact with a processivity accessory 
protein, which is known to be the case in the T4 
replication complex (Berdis et al, 1996). 
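
The two B-factors quoted above for the Cα of M1 can be converted into approximate atomic displacements using the standard isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The short sketch below applies that relation; the function name is illustrative, and the calculation ignores any contribution from static disorder or model error.

```python
import math

def rms_displacement(b_factor):
    """Root-mean-square displacement (angstroms) implied by an isotropic
    B-factor (square angstroms), from B = 8 * pi**2 * <u**2>."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# B-factors of the M1 C-alpha atom as quoted in the text.
u_9n7 = rms_displacement(26.0)
u_rb69 = rms_displacement(95.0)
print(f"9N-7 pol M1: {u_9n7:.2f} A, RB69 pol M1: {u_rb69:.2f} A")
```

On this reading, the buried N terminus of 9°N-7 pol is displaced by roughly 0.57 Å, against roughly 1.10 Å for the solvent-exposed N terminus of RB69 pol.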

Another common strategy for protein thermo- 
stability is to lower the solvent-accessible surface 
area of the protein and to increase the proportion 
of buried structure (Korndorfer et al, 1995; Chan 
et al, 1995). This translates into a more compact 
structural design. There are at least 12 examples of 
loop segments in RB69 pol that are much shorter 
or absent in 9°N-7 pol. Some of the more striking 
examples are shown in Figure 4. Alignment of 16 
archaeal pols (Edgell et al, 1997) indicates that they 
share practically all of these sequence "deletions". 
The Tgo pol structure also revealed shortened loop 
segments relative to RB69 pol (Hopfner et al, 
1999). Nevertheless, the overall ratio of solvent- 
accessible surface area to volume for both 9°N-7 
and RB69 pols is the same (0.33). Thus, while low- 
ering the surface area to volume ratio is a common 
strategy for thermostability, it is not the primary 
basis for the stability of 9°N-7 pol. 

Materials and Methods 

Purification, crystallization, and data collection 

Thermococcus sp. 9°N-7 polymerase (wild-type and the 
D141A,E143A exonuclease-deficient mutant) was over- 
expressed and purified as described (Southworth et al, 
1996). Crystallization, cryoprotection, data collection and 
reduction of native crystals are described (Zhou et al, 
1998). Derivatives were prepared by soaking native 
crystals in stabilization solution (Zhou et al, 1998) 
supplemented with 22.7 mM sodium ethylmercuri- 
thiosalicylate (thimerosal) for 11 days (thimerosal-1), 
3.0 mM K₂PtCl₄ for one hour (PtCl-1), 1.5 mM di-μ-iodo- 
bis(ethylenediamine)diplatinum (II) nitrate (PIP) for one 
day (PIP-1), or 1.0 mM Baker's mercurial for 50 hours 
(BAHg). These crystals were stepped through stabiliz- 
ation solution containing 8% (five minutes), 16% (five 
minutes), and 30% sucrose (one to five hours). 
Additional derivatives were collected with the improved 
cryoprotection procedure reported (Zhou et al, 1998) by 
soaking native crystals in 23.0 mM thimerosal for 8.3 
days (thimerosal-2), 3.0 mM K₂PtCl₄ for seven days 
(PtCl-2), and 1.5 mM PIP for 35 hours (PIP-2). 

Structure determination 

The structure of the D141A,E143A mutant of 9°N-7 
polymerase was determined by the method of multiple 
isomorphous replacement (MIR). A number of native 
and derivative crystals were used to solve the structure 
because of problems with non-isomorphism (Table 1). 
Three native datasets were collected from single crystals. 
NAT-1 was mounted in the liquid nitrogen stream 
directly from cryoprotectant, whereas NAT-2 and -3 
were flash-frozen in liquid nitrogen prior to mounting. 
The crystals belong to space group P2₁2₁2, with unit cell 
dimensions of approximately a = 96.1 Å, b = 101.1 Å, 
c = 112.2 Å (for NAT-3). One molecule is present per 
asymmetric unit, giving a solvent content of approxi- 
mately 60%. 
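
The quoted solvent content can be checked roughly from the unit cell using the Matthews coefficient. The sketch below is a back-of-the-envelope estimate under stated assumptions: a molecular mass of about 90 kDa for the polymerase (not given in the text), four asymmetric units per cell for space group P2₁2₁2, and the conventional protein term of 1.23 Å³/Da in the solvent-fraction formula. The function name is illustrative.

```python
def matthews(a, b, c, molecules_per_cell, mass_da):
    """Matthews coefficient (cubic angstroms per dalton) and estimated
    solvent fraction for an orthorhombic cell with edges a, b, c."""
    cell_volume = a * b * c  # valid because all cell angles are 90 degrees
    vm = cell_volume / (molecules_per_cell * mass_da)
    solvent_fraction = 1.0 - 1.23 / vm  # conventional approximation
    return vm, solvent_fraction

# Cell edges for NAT-3 from the text; the 90 kDa mass is an assumption.
vm, solvent = matthews(96.1, 101.1, 112.2, molecules_per_cell=4, mass_da=90_000)
print(f"Vm = {vm:.2f} A^3/Da, solvent content = {solvent:.0%}")
```

With these inputs the estimate comes out near 3.0 Å³/Da and 59% solvent, in line with the approximately 60% stated above.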

A difference Patterson map of thimerosal-1 was calcu- 
lated using the program FFT in the CCP4 suite (CCP4, 
1994). One heavy-atom site for this derivative was ident- 
ified with the program RSPS (Knight, 1989). This site was 
used to calculate initial phases for NAT-1 at 5 Å resol- 
ution using the program MLPHARE (Otwinowski, 1991). 
Difference Fourier synthesis with the initial phases 
revealed three sites for the PtCl-1 derivative. Two more 
sites for this derivative were discovered with the phases 
derived from both thimerosal-1 and PtCl-1. The correct 
handedness of the phasing information from these deriva- 
tives was determined using MLPHARE, and anomalous 
scattering data from the derivatives were included in the 
phase calculation. Three sites for the BAHg derivative 
and four sites for PIP-1 were obtained from difference 
Fourier maps calculated to 5 Å resolution. All of these 
heavy-atom sites were included in subsequent phase cal- 
culations with NAT-1. The high-resolution phasing limit 
was extended to 3.5 Å. Because of the high solvent con- 
tent in the crystals, use of the solvent-flattening program 
DM (Cowtan, 1994), in combination with histogram 
matching, improved the phases substantially. A polyala- 
nine model was built into the improved electron density 
map of NAT-1 with the program O (Jones & Kjeldgaard, 
1993) and refined in the program X-PLOR (Brunger, 
1992). Phase combination using the program SIGMAA 
(Read, 1986) further improved the map during building 
and refinement. 

Identification of side-chain densities was possible only 
after collecting a higher-resolution native dataset 
(NAT-2), along with diffraction data for three more 
derivatives obtained under improved cryoprotection 
conditions (thimerosal-2, one site; PtCl-2, four sites; 
PIP-2, five sites). These derivatives were used to calcu- 
late MIR phases of NAT-2 to 3.0 Å resolution. Partial 
model phases of NAT-2 were calculated using the 
refined polyalanine model derived from NAT-1. Because 
of significant differences in unit cell dimensions between 
NAT-1 and 2, it was first necessary to subject NAT-2 to 
rigid-body refinement against NAT-1 in X-PLOR. 
Combination of the polyalanine model phases and MIR 
phases with SIGMAA improved the electron density 
map. Model building, refinement, and phase combi- 
nation were reiterated until a complete polyalanine 
model could be built. In the final stage of refinement, 
NAT-3 was used to extend the resolution limit to 2.1 Å 
and water molecules were added. 



Coordinate files and illustrations 

The Thermococcus sp. 9°N-7 polymerase atomic coordi- 
nates and structure factors have been deposited in the 
RCSB Protein Data Bank under the accession code 
1QHT. The RB69 coordinates used for comparisons in 
this manuscript are those of the orthorhombic crystal 
form (accession code 1WAJ). Figures were prepared 
within the IRIS Showcase program (Silicon Graphics, 
Inc.) entirely (1(b), 2, 3(a) and 3(b)) or with images 
imported from MOLSCRIPT (1(a)) (Priestle, 1991) or 
SETOR (3(c), 4-7) (Evans, 1993). 